Parsing selectively with SoupStrainer

from BeautifulSoup import BeautifulSoup
import urllib
import re

url = "Some Shopping Site"
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)
for a in soup.findAll('a', {'title': re.compile('.+')}):
    print a.string

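2 Answers

User

#1 · Edited 2024-06-12 16:53:20

Oh man, silly me. I was looking for the tag with the attribute id=products, but it should have been products_list.

In case anyone comes searching, here is the final code.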

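from BeautifulSoup import BeautifulSoup, SoupStrainer
import time
import urllib
import re


# Start of a timing measurement; the elapsed time is never reported in this snippet
start = time.clock()
url = "http://someplace.com"
html = urllib.urlopen(url).read()
# Parse only the div whose id is products_list and skip the rest of the page
product = SoupStrainer('div', {'id': 'products_list'})
soup = BeautifulSoup(html, parseOnlyThese=product)
# Print the text of every link inside that div that has a non-empty title
for a in soup.findAll('a', {'title': re.compile('.+')}):
    print a.string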
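
For anyone on Python 3, here is a rough sketch of the same idea with the bs4 package, where the keyword is parse_only rather than parseOnlyThese. The inline HTML, the helper name titled_link_texts, and the choice of the built-in "html.parser" backend are assumptions made for illustration; they are not part of the code above.

```python
import re

from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical stand-in for the downloaded page.
HTML = """
<html><body>
  <div id="nav"><a href="/" title="Home">Home</a></div>
  <div id="products_list">
    <a href="/p/1" title="Red mug">Red mug</a>
    <a href="/p/2">No title here</a>
    <a href="/p/3" title="Blue plate">Blue plate</a>
  </div>
</body></html>
"""


def titled_link_texts(html):
    """Parse only the products_list div and return the text of its titled links."""
    # The strainer tells the parser to keep just the matching div and its contents.
    only_products = SoupStrainer("div", {"id": "products_list"})
    soup = BeautifulSoup(html, "html.parser", parse_only=only_products)
    # Same filter as the original: <a> tags whose title attribute is non-empty.
    return [a.get_text() for a in soup.find_all("a", {"title": re.compile(".+")})]


if __name__ == "__main__":
    print(titled_link_texts(HTML))
```

The link in the nav div never makes it into the tree, so it cannot show up in the results even though it has a title.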

User

#2 · Edited 2024-06-12 16:53:20

Try searching for the product list div first, then searching within it for the a tags that have a title:

product = soup.find('div',{'id': 'products'})
for a in product.findAll('a',{'title': re.compile('.+') }):
   print a.string
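
One thing to watch with this approach: find returns None when no div has the given id, and calling findAll on None then fails with an AttributeError. A minimal sketch of a guarded version follows, written against the Python 3 bs4 package rather than the BeautifulSoup 3 module used above; the helper name titled_links_in and the sample markup are hypothetical.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
PAGE = (
    '<div id="products_list">'
    '<a href="/p/1" title="Red mug">Red mug</a>'
    '<a href="/p/2">untitled</a>'
    "</div>"
)


def titled_links_in(html, div_id):
    """Parse the whole page, then search for titled links inside one div."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"id": div_id})
    if container is None:
        # No div with that id, so there is nothing to search.
        return []
    return [a.get_text() for a in container.find_all("a", {"title": re.compile(".+")})]


if __name__ == "__main__":
    print(titled_links_in(PAGE, "products_list"))
    print(titled_links_in(PAGE, "products"))
```

Unlike a SoupStrainer, this still builds the tree for the entire page and only narrows the search afterwards, so it is simpler but does not save any parsing work.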
