ReactorNotRestartable error in a while loop with Scrapy

Posted 2024-06-11 15:43:18


When I run the following code, I get a twisted.internet.error.ReactorNotRestartable error:

from time import sleep
from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from scrapy.xlib.pydispatch import dispatcher

result = None

def set_result(item):
    result = item

while True:
    process = CrawlerProcess(get_project_settings())
    dispatcher.connect(set_result, signals.item_scraped)

    process.crawl('my_spider')
    process.start()

    if result:
        break
    sleep(3)

It works the first time through the loop, then I get the error. I create the process variable on every iteration, so what is the problem?


3 Answers

I was able to solve the problem like this. process.start() should be called only once.

from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

result = None

def set_result(item):
    # Rebind the module-level variable instead of creating a local one
    global result
    result = item

# Create the process and schedule the crawl once, outside any loop
process = CrawlerProcess(get_project_settings())
crawler = process.create_crawler('my_spider')
crawler.signals.connect(set_result, signal=signals.item_scraped)
process.crawl(crawler)

# Start the reactor exactly once; this blocks until the crawl finishes
process.start()

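By default, CrawlerProcess's start() stops the Twisted reactor it creates once all crawlers have finished. A stopped reactor cannot be started again, which is why the second call to process.start() fails.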

If you create process in each iteration, you should call process.start(stop_after_crawl=False).

Another option is to handle the Twisted reactor yourself and use CrawlerRunner. The docs have an example of doing that.

Reference: http://crawl.blog/scrapy-loop/

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from twisted.internet import reactor
from twisted.internet.task import deferLater

def sleep(_result, *args, seconds):
    """Non-blocking sleep callback; _result is the previous callback's return value"""
    return deferLater(reactor, seconds, lambda: None)

process = CrawlerProcess(get_project_settings())

def _crawl(result, spider):
    # Run the spider, wait 100 seconds without blocking the reactor, then crawl again
    deferred = process.crawl(spider)
    deferred.addCallback(lambda results: print('waiting 100 seconds before restart...'))
    deferred.addCallback(sleep, seconds=100)
    deferred.addCallback(_crawl, spider)
    return deferred


# MySpider is your own scrapy.Spider subclass, defined or imported elsewhere
_crawl(None, MySpider)
process.start()
