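from scrapy.http import Request
from scrapy.spider import BaseSpider  # import path as of Scrapy 0.12


class MySpider(BaseSpider):

    def make_requests_from_url(self, url):
        # Tag every initial request with the start URL it originated from.
        return Request(url, dont_filter=True, meta={'start_url': url})

    def parse(self, response):
        if response.meta['start_url'] == '???' and response.meta['depth'] > 10:
            # do something here for exceeding the limit for this start URL
            pass
        else:
            # find links and yield requests for them, passing the start URL along
            # (other_url stands for a link extracted from the response)
            yield Request(other_url, meta={'start_url': response.meta['start_url']})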
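Sorry, it looks like I misunderstood your question at first. Correcting my answer:

The response has a depth key in its meta dictionary (response.meta['depth']). You can check it and take the appropriate action. For the method overridden above, see http://doc.scrapy.org/en/0.12/topics/spiders.html#scrapy.spider.BaseSpider.make_requests_from_url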
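The check described above can also be pulled out into a small helper, which makes it easy to try without running a crawl. This is only a sketch: the names DEPTH_LIMITS, DEFAULT_LIMIT and exceeds_depth_limit are hypothetical, the example.com URLs are placeholders, and the fallback limit is an assumption. Only the start_url and depth meta keys come from the answer itself.

```python
# Hypothetical per-start-URL depth limits; the URLs are placeholders.
DEPTH_LIMITS = {
    'http://example.com/a': 10,
    'http://example.com/b': 3,
}
DEFAULT_LIMIT = 5  # assumed fallback for start URLs not listed above


def exceeds_depth_limit(meta, limits=DEPTH_LIMITS, default=DEFAULT_LIMIT):
    """Return True if a response is deeper than its start URL allows.

    `meta` is expected to look like response.meta: a dict carrying the
    'start_url' key set in make_requests_from_url and a 'depth' key.
    A missing 'depth' is treated as 0.
    """
    limit = limits.get(meta.get('start_url'), default)
    return meta.get('depth', 0) > limit
```

Inside parse, something like `if exceeds_depth_limit(response.meta): return` before yielding further requests would then stop following links for that start URL only, while other start URLs keep their own limits.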