到目前为止我还可以。我的问题是,我如何才能进一步刮这个网址列表?在搜索之后,我知道我可以在解析中返回一个请求,但它似乎只能处理一个URL。
这是我的分析:
def parse(self, response):
# Get the list of URLs, for example:
list = ["http://a.com", "http://b.com", "http://c.com"]
return scrapy.Request(list[0])
# It works, but how can I continue b.com and c.com?
我可以这样做吗?
def parse(self, response):
# Get the list of URLs, for example:
list = ["http://a.com", "http://b.com", "http://c.com"]
for link in list:
scrapy.Request(link)
# This is wrong, though I need something like this
完整版本:
import scrapy
class MySpider(scrapy.Spider):
name = "mySpider"
allowed_domains = ["x.com"]
start_urls = ["http://x.com"]
def parse(self, response):
# Get the list of URLs, for example:
list = ["http://a.com", "http://b.com", "http://c.com"]
for link in list:
scrapy.Request(link)
# This is wrong, though I need something like this
目前没有回答
相关问题 更多 >
编程相关推荐