<p>Find the first header and iterate over its <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-siblings-and-previous-siblings" rel="nofollow"><code>.next_siblings</code></a> until another header is found:</p>
<pre><code>from bs4 import BeautifulSoup
data = """
<div class="left_panel">
<h4>Header1</h4>
block of text that I want.
<br />
<br />
another block of text that I want.
<br />
<br />
still more text that I want.
<br />
<br />
<p>&nbsp;</p>
<h4>Header2</h4>
</div>
"""
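soup = BeautifulSoup(data, 'html.parser')

# Locate the first header, then walk the nodes that follow it at the same level.
header1 = soup.find('h4', string='Header1')
for item in header1.next_siblings:
    # Stop at the next header; the None default covers plain text nodes.
    if getattr(item, 'name', None) == 'h4' and item.text == 'Header2':
        break
    print(item)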
</code></pre>
<hr/>
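<p>Update (collecting the text between the two <code>h4</code> tags): one way is to accumulate each sibling's text in a list inside the same loop instead of printing it, then join the pieces.</p>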
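<p>A minimal sketch of that collection step, assuming Python 3 and the standard-library <code>html.parser</code> backend. The helper name <code>text_between_headers</code> is hypothetical and not part of BeautifulSoup. It walks the siblings the same way as the loop above, but stores each node's stripped text instead of printing it:</p>

```python
from bs4 import BeautifulSoup, Tag

# Same sample markup as in the answer, repeated so the sketch runs on its own.
data = """
<div class="left_panel">
<h4>Header1</h4>
block of text that I want.
<br />
<br />
another block of text that I want.
<br />
<br />
still more text that I want.
<br />
<br />
<p>&nbsp;</p>
<h4>Header2</h4>
</div>
"""


def text_between_headers(html, start, end):
    """Hypothetical helper: non-empty text chunks between two h4 tags."""
    soup = BeautifulSoup(html, "html.parser")
    header = soup.find("h4", string=start)
    chunks = []
    for item in header.next_siblings:
        if isinstance(item, Tag):
            # Stop as soon as the closing header is reached.
            if item.name == "h4" and item.get_text() == end:
                break
            text = item.get_text()
        else:
            # Bare text nodes are used as they are.
            text = str(item)
        text = text.strip()  # also drops the non-breaking space from <p>&nbsp;</p>
        if text:
            chunks.append(text)
    return chunks


texts = text_between_headers(data, "Header1", "Header2")
print("\n".join(texts))
```

<p>Returning a list rather than a joined string leaves the choice of separator to the caller. The helper assumes the starting header exists; <code>find</code> returns <code>None</code> otherwise.</p>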