How to get alternating child tags in Python BeautifulSoup

2 answers

User

Answer 1 · edited 2024-04-25 06:51:30

There are many ways to do this, but the simplest one for me is to select all the h3 tags and then traverse the DOM to get each one's next sibling.
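The approach described above could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the sample HTML and the helper name `next_tag_sibling` are assumptions made for the example. The helper walks `.next_sibling` manually and skips the whitespace text nodes that sit between tags.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Sample markup (assumed for illustration): headers alternating with divs.
html = """
<div>
    <h3>First header</h3>
    <div>First div to go with a header</div>
    <h3>Second header</h3>
    <div>Second div to go with a header</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def next_tag_sibling(node):
    """Hypothetical helper: follow .next_sibling until a Tag is reached.

    Whitespace between tags is parsed as text nodes, so the immediate
    .next_sibling of an <h3> is usually a newline string, not the <div>.
    """
    sibling = node.next_sibling
    while sibling is not None and not isinstance(sibling, Tag):
        sibling = sibling.next_sibling
    return sibling


pairs = []
for header in soup.find_all("h3"):
    sibling = next_tag_sibling(header)
    # sibling is None when the header is the last element in its parent.
    pairs.append((header.get_text(), sibling.get_text() if sibling else None))

for header_text, sibling_text in pairs:
    print(header_text, "->", sibling_text)
```

In practice `find_next_sibling()` does the same skipping for you, so the manual loop is mainly useful for seeing what the traversal involves.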

User

Answer 2 · edited 2024-04-25 06:51:30

Find all the headers, then get the next sibling from each one:

for header in soup.select('div h3'):
    next_div = header.find_next_sibling('div')

element.find_next_sibling() returns a single element, or None if no matching sibling is found.

Demo:
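Because the call can return None, it is probably worth guarding against that before reading `.text`. Here is a minimal sketch, assuming a made-up document in which the last header has no div after it; the `results` dictionary is just an illustrative way to collect the pairs.

```python
from bs4 import BeautifulSoup

# Sample markup (assumed for illustration): the second header has no div.
html = """
<div>
    <h3>Header with a div</h3>
    <div>Matching div</div>
    <h3>Header without a div</h3>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

results = {}
for header in soup.select("div h3"):
    next_div = header.find_next_sibling("div")
    # find_next_sibling() returns None when no later <div> sibling exists,
    # so check before accessing the text.
    results[header.get_text()] = (
        next_div.get_text() if next_div is not None else None
    )

print(results)
```

Note that `find_next_sibling('div')` looks for the next div among all later siblings, not only the immediately adjacent element. If a header in the middle of the list is missing its div, the call would return the div that belongs to a later header.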

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <div>
...     <h3>First header</h3>
...     <div>First div to go with a header</div>
...     <h3>Second header</h3>
...     <div>Second div to go with a header</div>
... </div>
... ''', 'html.parser')
>>> for header in soup.select('div h3'):
...     next_div = header.find_next_sibling('div')
...     print(header.text, next_div.text)
... 
First header First div to go with a header
Second header Second div to go with a header
