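<p>In fact, you can create new Request objects for the URLs that SitemapSpider extracts from the sitemap, and parse their responses with a separate callback:</p>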
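<pre><code>from scrapy import Request
from scrapy.spiders import SitemapSpider


class MySpider(SitemapSpider):
    name = "xyz"
    allowed_domains = ["xyz.nl"]
    sitemap_urls = ["http://www.xyz.nl/sitemap.xml"]

    def parse(self, response):
        print(response.url)
        # response.url was already fetched once; dont_filter=True keeps the duplicate filter from dropping this request
        return Request(response.url, callback=self.parse_sitemap_url, dont_filter=True)

    def parse_sitemap_url(self, response):
        # do stuff with your sitemap links
        pass
</code></pre>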