How do I get the tags inside a div tag in Python?

<div id="main_category">
  <div class="tit1"><a href="#" onclick="ExpandStage(1);"><strong>Phase 1</strong><br />April 15 - 19</a></div>
  <ul id="phase1">
    <li><a href="expexhibitorlist.aspx?categoryno=411">Consumer Electronics and Information Products</a></li>
    <li><a href="expexhibitorlist.aspx?categoryno=412">Electronic and Electrical Products</a></li>

from bs4 import BeautifulSoup
import re
import urllib.request

r = urllib.request.urlopen('http://i.cantonfair.org.cn/en/expexhibitorlist.aspx?categoryno=410').read()
soup = BeautifulSoup(r, "html.parser")
letters = soup.find_all("div", {"id": "main_category"})
for element in letters:
    categories = element.a.get_text()
    print(categories)

1 answer

User

#1 · Posted 2024-04-25 21:54:10

I'm using Python 2.7, and the code below works for me. The same approach works in Python 3. Hope it helps:

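# Python 2.7; on Python 3, import urlopen from urllib.request and use print(hrefs)
from bs4 import BeautifulSoup as bs
from urllib2 import urlopen

r = urlopen('http://i.cantonfair.org.cn/en/expexhibitorlist.aspx?categoryno=410').read()
soup = bs(r, "lxml")  # needs the lxml package; "html.parser" also works
# Every category link sits inside an <li>, so collect the <li> elements first
lis = soup.find_all("li")
# Take the href of the first <a> in each <li>, skipping items that have none
hrefs = [c.a['href'] for c in lis if c.a is not None and c.a.has_attr('href')]
print hrefs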
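
For Python 3, here is a minimal sketch of the same idea, restricted to the `main_category` div from the question so that `<li>` elements elsewhere on the page are ignored. It parses the sample markup from the question instead of fetching the live page. The closing `</ul>` and `</div>` tags are my addition (the posted snippet was cut off), and the names `HTML` and `category_links` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question. The closing </ul> and </div>
# are an assumption added here so the snippet is well-formed.
HTML = """
<div id="main_category">
  <div class="tit1"><a href="#" onclick="ExpandStage(1);"><strong>Phase 1</strong><br />April 15 - 19</a></div>
  <ul id="phase1">
    <li><a href="expexhibitorlist.aspx?categoryno=411">Consumer Electronics and Information Products</a></li>
    <li><a href="expexhibitorlist.aspx?categoryno=412">Electronic and Electrical Products</a></li>
  </ul>
</div>
"""


def category_links(html):
    """Return (text, href) pairs for each <li> link inside div#main_category."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id="main_category")
    if container is None:
        # The div is missing, so there is nothing to extract.
        return []
    links = []
    for li in container.find_all("li"):
        # href=True matches only <a> tags that actually carry an href.
        a = li.find("a", href=True)
        if a is not None:
            links.append((a.get_text(strip=True), a["href"]))
    return links


if __name__ == "__main__":
    for text, href in category_links(HTML):
        print(text, href)
```

The original attempt printed only one line because `element.a` returns just the first `<a>` under the div, which is the "Phase 1" heading link. Looping over the `<li>` elements reaches every category link.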
