Using BeautifulSoup to find a div so I can decompose() it, but it affects the Golang template code in the markup. Why?

Published 2024-03-29 14:03:55


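I open an HTML file that is a Hugo template, run it through BeautifulSoup, and use it to find a specific div. I then call .decompose() to remove that whole tag and wrap the soup in str() so I can write it back to the file, but this also removes or rearranges other characters.

print(soup) shows me the mangled output.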

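from bs4 import BeautifulSoup

homeHtml = 'layouts/index.html'

# Parse the Hugo template as if it were plain HTML
with open(homeHtml) as f:
  soup = BeautifulSoup(f, 'html.parser')
  # Find the div to delete and remove it from the tree
  removeDiv = soup.find("div", {'class': 'removeMe'})
  removeDiv.decompose()
  myText = str(soup)
# Write the re-serialized document back over the original file
with open(homeHtml, 'w') as f:
  f.write(myText)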

Output of print(soup):

{{ define "main" }}
<div class="hero tall">
<div class="container">
<h1>{{ i18n "home_hero_title" | safeHTML }}</h1>
<div class="buttons flex">
<div class="flex1 tar mr1m">
<a " class="dib" getting-started href="{{ " rellangurl | }}">{{ i18n "home_hero_getting_started" | safeHTML }}</a>
</div>
<div class="flex1 tal ml1m">
<a " become-a-sponsor class="dib" href="{{ " rellangurl | }}">{{ i18n "home_hero_sponsor" | safeHTML }}</a>
</div>
</div>
</div>
</div>
<div class="page-content">
<div class="wrapper">
<div class="">

</div>
</div>
</div>
{{ end }}

It mangles some of the markup:

<div class="flex1 tal ml1m">
<a " become-a-sponsor class="dib" href="{{ " rellangurl | }}">{{ i18n "home_hero_sponsor" | safeHTML }}</a>
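A possible explanation, stated as an assumption because the original template is not shown here: the href in the source probably looked something like href="{{ "getting-started" | relLangURL }}". An HTML parser knows nothing about Go template actions, so it would end the attribute value at the second double quote and treat the rest as separate attributes, which would match the scrambled output above. One workaround that might help is to hide every {{ ... }} action behind a placeholder before parsing and put it back afterwards. The sketch below is only an illustration; mask_actions, unmask_actions, and remove_div are hypothetical helper names, not part of BeautifulSoup or Hugo.

```python
import re
import uuid

# Matches one Go template action such as {{ i18n "home_hero_title" | safeHTML }}.
# Assumption: no action contains a literal "}}" inside a quoted string.
ACTION_RE = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def mask_actions(text):
    """Replace each {{ ... }} action with a unique alphanumeric token."""
    saved = {}

    def _swap(match):
        # Lowercase token, so it survives the parser lowercasing attribute names.
        token = "gotmpl" + uuid.uuid4().hex
        saved[token] = match.group(0)
        return token

    return ACTION_RE.sub(_swap, text), saved


def unmask_actions(text, saved):
    """Put the original {{ ... }} actions back in place of their tokens."""
    for token, action in saved.items():
        text = text.replace(token, action)
    return text


def remove_div(template_text, css_class="removeMe"):
    """Remove the first div with the given class without touching template actions."""
    from bs4 import BeautifulSoup  # third-party package: beautifulsoup4

    masked, saved = mask_actions(template_text)
    soup = BeautifulSoup(masked, "html.parser")
    target = soup.find("div", {"class": css_class})
    if target is not None:  # find() returns None when nothing matches
        target.decompose()
    return unmask_actions(str(soup), saved)
```

With these helpers, the read and write steps from the question would stay the same, with remove_div(f.read()) replacing the direct BeautifulSoup call. Note that the parser may still normalize whitespace or attribute quoting in the surrounding HTML, so the rewritten file is not guaranteed to be byte-identical outside the removed div.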
