<p><a href="http://doc.scrapy.org/en/latest/topics/spider-middleware.html#scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware" rel="nofollow">OffsiteMiddleware</a> is what you should consider using:</p>
<blockquote>
<p><code>class scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware</code></p>
<p>Filters out Requests for URLs outside the domains covered by the
spider.</p>
<p>This middleware filters out every request whose host names aren’t in
the spider’s allowed_domains attribute.</p>
</blockquote>
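<p>In practice this means listing the domains to keep in the spider's <code>allowed_domains</code> attribute, for example <code>allowed_domains = ["example.com"]</code>. To make the filtering rule concrete, here is a minimal standard-library sketch of the host check described in the quote. It is not Scrapy's actual implementation: the function name <code>is_offsite</code> and the sample URLs are hypothetical, and it assumes that subdomains of an allowed domain count as on-site and that an empty list filters nothing.</p>

```python
from urllib.parse import urlparse


def is_offsite(url, allowed_domains):
    """Return True if the URL's host is outside every allowed domain.

    Hypothetical helper for illustration only; Scrapy does this inside
    OffsiteMiddleware, driven by the spider's allowed_domains attribute.
    """
    # Assumption: with no allowed domains configured, nothing is filtered.
    if not allowed_domains:
        return False

    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # Accept the domain itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return False
    return True


# Hypothetical sample data.
allowed_domains = ["example.com"]
urls = [
    "http://example.com/index.html",
    "http://www.example.com/page",
    "http://other.org/",
    "http://notexample.com/",
]

# Keep only the requests the rule would let through.
kept = [u for u in urls if not is_offsite(u, allowed_domains)]
```

<p>With the sample data above, only the two <code>example.com</code> URLs remain in <code>kept</code>. Matching on a leading dot rather than a plain suffix keeps look-alike hosts such as <code>notexample.com</code> out.</p>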