<p>Don't use a regular expression for this. Use an HTML parser such as <a href="http://www.crummy.com/software/BeautifulSoup/" rel="noreferrer">BeautifulSoup</a>.</p>
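<pre><code># Uses Beautiful Soup 4 (pip install beautifulsoup4); the older
# "from BeautifulSoup import BeautifulSoup" import only works with version 3 on Python 2.
from bs4 import BeautifulSoup

html = """
<div id=hotlinklist>
<a href="foo1.com">Foo1</a>
<div id=hotlink>
<a href="/">Home</a>
</div>
<div id=hotlink>
<a href="/extract">Extract</a>
</div>
<div id=hotlink>
<a href="/sitemap">Sitemap</a>
</div>
</div>"""

# Parse with the standard-library backend so no extra parser package is needed.
soup = BeautifulSoup(html, "html.parser")

# find_all returns the matching divs in document order; index 2 is the third one.
soup.find_all("div", id="hotlink")[2].a
# <a href="/sitemap">Sitemap</a>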
</code></pre>
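<p>If installing a third-party package is not an option, the same idea can be sketched with the standard library's <code>html.parser</code>. This is only an illustration under assumptions: the class name <code>HotlinkCollector</code> and its attributes are made up for this example, and it assumes each link of interest is an <code>a</code> tag with an <code>href</code> nested inside a <code>div</code> whose id is <code>hotlink</code>.</p>

```python
from html.parser import HTMLParser

HTML = """
<div id=hotlinklist>
<a href="foo1.com">Foo1</a>
<div id=hotlink>
<a href="/">Home</a>
</div>
<div id=hotlink>
<a href="/extract">Extract</a>
</div>
<div id=hotlink>
<a href="/sitemap">Sitemap</a>
</div>
</div>"""


class HotlinkCollector(HTMLParser):
    """Hypothetical helper: collect (href, text) for links inside <div id="hotlink">."""

    def __init__(self):
        super().__init__()
        self.div_stack = []       # one bool per open div: is it a hotlink div?
        self.current_href = None  # href of the link being read, if any
        self.current_text = []    # text fragments seen inside that link
        self.links = []           # collected (href, text) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            self.div_stack.append(attrs.get("id") == "hotlink")
        elif tag == "a" and any(self.div_stack):
            # Links without an href leave current_href as None and are skipped.
            self.current_href = attrs.get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()
        elif tag == "a" and self.current_href is not None:
            text = "".join(self.current_text).strip()
            self.links.append((self.current_href, text))
            self.current_href = None


collector = HotlinkCollector()
collector.feed(HTML)
links = collector.links
third_link = links[2]  # same link as the third hotlink div in the answer above
```

<p>Here <code>links</code> holds one <code>(href, text)</code> pair per hotlink div, so <code>links[2]</code> corresponds to the Sitemap link. For anything beyond a small, well-understood page, BeautifulSoup is usually the more convenient choice, since it copes with messy markup and needs far less bookkeeping.</p>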