Python web crawling - Q&A - Python中文网


Published 2024-05-23 16:47:23



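Is there a Python crawler that can extract all the data from a single web page? For example: http://www.bestbuy.com/site/HTC+-+One+S+4G+Mobile+Phone+-+Gradient+Blue+%28T-Mobile%29/4980512.p?id=1218587135819&skuId=4980512&contract_desc= On this page the customer reviews are split across two pages, 1 and 2. I want to crawl this URL and fetch the content of both review pages. Is that possible with a Python crawler?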

Do Python crawlers also support all the modern GET/POST techniques?


2 answers

User

#1 · Edited 2024-05-23 16:47:23

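If you want to crawl a whole site, see this post. If you only need to process a few pages and analyze their contents (that is, you already know which URLs to process), try BeautifulSoup, which lets you do something like this: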

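from urllib.request import urlopen  # Python 3 replacement for urllib2
from bs4 import BeautifulSoup

page = urlopen(url)  # url: a page you already know you want to process
soup = BeautifulSoup(page.read(), "html.parser")
for f in soup.find_all("form"):
    target_url = f.get("action")  # None if the form has no action attribute
    # do something with each one of the forms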
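
On the GET/POST part of the question, here is a minimal sketch, assuming Python 3's standard-library `urllib.request`. A request is sent as GET unless a `data` payload is supplied, in which case it is sent as POST. The helper names (`build_get`, `build_post`, `fetch`), the example.com URLs, and the field names are hypothetical and only there for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_get(url, params=None):
    """Return a GET request; params (a dict) are appended as a query string."""
    if params:
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode(params)
    return Request(url)


def build_post(url, fields):
    """Return a POST request; fields (a dict) are sent as a form-encoded body."""
    body = urlencode(fields).encode("utf-8")
    return Request(url, data=body)


def fetch(request, timeout=10):
    """Send the request and return the raw response body as bytes."""
    with urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical URLs and field names, for illustration only.
get_req = build_get("http://example.com/reviews", {"page": 2})
post_req = build_post("http://example.com/search", {"q": "phone"})
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

Nothing is sent over the network until `fetch(get_req)` or `fetch(post_req)` is called; the bytes it returns can be handed to BeautifulSoup for parsing.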

User

#2 · Edited 2024-05-23 16:47:23

You can use Scrapy:

Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
