BeautifulSoup: add a <div> with a class at the end of the HTML

Posted 2024-05-23 17:08:22


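I have an HTML file with many tags (divs nested inside divs). I want to add a new tag with a class at a specific position near the end of the HTML. I tried append, insert, and insert_after/insert_before, but none of them worked the way I expected.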

My HTML input is:

   <div id="page">
   <div id="records">
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content</p>
   </div>
   </div>
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display again once
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content again once</p>
   </div>
   </div>
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display second time
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content second time</p>
   </div>
   </div>
</div>
</div>

I want to add a new <div> tag with class="record" as the last child of <div id="records">, that is, just before its closing tag.

The output should look like this:

   <div id="page">
   <div id="records">
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content</p>
   </div>
   </div>
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display again once
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content again once</p>
   </div>
   </div>
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display second time
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content second time</p>
   </div>
   </div>
   <div class="record">
   <div class="header">
   <div class="title">
    Something here to display 3rd time
   </div>
   </div>
   <div class="disclaimer">
   <p>Here i want to print content 3rd time</p>
   </div>
   </div>
</div>
</div>

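In my case the number of <div class="record"> elements is not fixed; it can be different every time.

I am looking for a solution or suggestions for doing this with BeautifulSoup in Python.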


2 answers

You can call insert_after on the last item returned by soup.find_all('div', class_='record'):

from bs4 import BeautifulSoup

html = '<div id="records"> <div class="record"> <div class="header"> <div class="title"> Something here to display </div> </div> <div class="disclaimer"> <p>Here i want to print content</p> </div> </div> <div class="record"> <div class="header"> <div class="title"> Something here to display again once </div> </div> <div class="disclaimer"> <p>Here i want to print content again once</p> </div> </div> <div class="record"> <div class="header"> <div class="title"> Something here to display second time </div> </div> <div class="disclaimer"> <p>Here i want to print content second time</p> </div> </div> </div>'

soup = BeautifulSoup(html, 'html.parser')

extra_html = '''
<div class="record">
    <div class="header">
        <div class="title">
            Something here to display 3rd time
        </div>
    </div>
    <div class="disclaimer">
        <p>Here i want to print content 3rd time</p>
    </div>
</div>'''

soup.find_all('div', class_='record')[-1].insert_after(BeautifulSoup(extra_html, 'html.parser')) # [-1] selects the last item

Output of print(soup.prettify()):

<div id="records">
 <div class="record">
  <div class="header">
   <div class="title">
    Something here to display
   </div>
  </div>
  <div class="disclaimer">
   <p>
    Here i want to print content
   </p>
  </div>
 </div>
 <div class="record">
  <div class="header">
   <div class="title">
    Something here to display again once
   </div>
  </div>
  <div class="disclaimer">
   <p>
    Here i want to print content again once
   </p>
  </div>
 </div>
 <div class="record">
  <div class="header">
   <div class="title">
    Something here to display second time
   </div>
  </div>
  <div class="disclaimer">
   <p>
    Here i want to print content second time
   </p>
  </div>
 </div>
 <div class="record">
  <div class="header">
   <div class="title">
    Something here to display 3rd time
   </div>
  </div>
  <div class="disclaimer">
   <p>
    Here i want to print content 3rd time
   </p>
  </div>
 </div>
</div>
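
Because the number of records varies, find_all could also return an empty list, in which case [-1] raises IndexError. Below is a minimal sketch of a guarded variant, assuming that falling back to appending into the <div id="records"> container is acceptable in that case. The helper name add_record is hypothetical and not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup


def add_record(soup, record_html):
    """Insert record_html after the last .record div, or append it to #records if none exist.

    add_record is a hypothetical helper for illustration only.
    """
    # Parse the fragment and take its first <div> so a single Tag is inserted.
    new_record = BeautifulSoup(record_html, 'html.parser').div
    records = soup.find_all('div', class_='record')
    if records:
        # At least one record exists: place the new one right after the last.
        records[-1].insert_after(new_record)
    else:
        # No records yet: append to the container instead.
        soup.find('div', id='records').append(new_record)
    return soup


soup = BeautifulSoup('<div id="records"><div class="record">first</div></div>', 'html.parser')
add_record(soup, '<div class="record">second</div>')
print(soup.prettify())
```

Either branch leaves the new record as the last child of <div id="records">, whether there are zero records or many.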

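With .append() you have to select the parent element first, here <div id="records">:

from bs4 import BeautifulSoup

newRecord = '''
<div class="record">
  <div class="header">
    <div class="title">
      Something here to display 3rd time
    </div>
  </div>
  <div class="disclaimer">
    <p>Here i want to print content 3rd time</p>
  </div>
</div>
'''

# sourceHTML is assumed to hold the HTML string from the question
soup = BeautifulSoup(sourceHTML, 'html.parser')
# Append inside #records; appending to #page would put the new record after the closing </div> of #records
records = soup.select_one('#records')
records.append(BeautifulSoup(newRecord, 'html.parser'))
print(soup.prettify())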
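
If the record's text comes from variables rather than a ready-made HTML string, the element can also be built with soup.new_tag instead of parsing a second document. The following is only a sketch under that assumption. The function name make_record and its parameters are hypothetical, and the small input document stands in for the question's HTML.

```python
from bs4 import BeautifulSoup


def make_record(soup, title_text, disclaimer_text):
    """Build a <div class="record"> tree shaped like the ones in the question.

    make_record is a hypothetical helper for illustration only.
    """
    record = soup.new_tag('div', attrs={'class': 'record'})

    header = soup.new_tag('div', attrs={'class': 'header'})
    title = soup.new_tag('div', attrs={'class': 'title'})
    title.string = title_text
    header.append(title)
    record.append(header)

    disclaimer = soup.new_tag('div', attrs={'class': 'disclaimer'})
    paragraph = soup.new_tag('p')
    paragraph.string = disclaimer_text
    disclaimer.append(paragraph)
    record.append(disclaimer)
    return record


# Stand-in for the question's document: an empty #records inside #page.
soup = BeautifulSoup('<div id="page"><div id="records"></div></div>', 'html.parser')
soup.find(id='records').append(
    make_record(soup,
                'Something here to display 3rd time',
                'Here i want to print content 3rd time'))
print(soup.prettify())
```

Building tags this way avoids string formatting of HTML, and text assigned through .string is escaped by BeautifulSoup on output.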
