Unable to build a standalone Scrapy spider binary with cx_Freeze

5 votes
1 answer
1943 views
Asked 2025-04-18 00:32

A brief description of my environment: Windows 7 64-bit, Python 2.7 64-bit, Scrapy 0.22, cx_Freeze 4.3.2.

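First, I developed a simple spider, and it works fine. Then I used the Scrapy core API to create an external script, main.py, that runs the spider; it also works as required. Here is the script: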

# external main.py using scrapy core API, 'test' is just replaced name of my project
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from test.spiders.testSpider import TestSpider
from test import settings, pipelines
from scrapy.utils.project import get_project_settings

spider = TestSpider(domain='test.com')
settings = get_project_settings()
crawler = Crawler(settings)
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run()

Now I want to package all of this into an executable with cx_Freeze, using a setup.py as suggested in another thread. Here is the code:

from cx_Freeze import setup, Executable

includes = ['scrapy', 'pkg_resources', 'lxml.etree', 'lxml._elementpath']

build_options = {'compressed' : True,
                'optimize' : 2,
                'namespace_packages' : ['zope', 'scrapy', 'pkg_resources'],
                'includes' : includes,
                'excludes' : []}

executable = Executable(script='main.py',
                        copyDependentFiles=True,
                        includes=includes)

setup(name='Stand-alone scraper',
      version='0.1',
      description='Stand-alone scraper',
      options= {'build_exe': build_options},
      executables=[executable])

The build itself completes and produces an exe without problems. The trouble starts when I try to run it:

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
    exec code in m.__dict__
  File "main.py", line 2, in <module>
    from scrapy.crawler import Crawler
  File "C:\Python27\lib\site-packages\scrapy\__init__.py", line 6, in <module>
    __version__ = pkgutil.get_data(__package__, 'VERSION').strip()
  File "C:\Python27\lib\pkgutil.py", line 591, in get_data
    return loader.get_data(resource_name)
IOError: [Errno 2] No such file or directory: 'scrapy\\VERSION'

I worked around this by moving the scrapy\VERSION file from the original sources (python\lib\site-packages\scrapy) into library.zip\scrapy in the build folder. On the second run of main.exe I got a different message:
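The manual step described above can also be scripted. The following is a minimal sketch that uses only the standard library `zipfile` module; the helper name `add_file_to_zip` is hypothetical, and the paths passed to it must match the actual build layout.

```python
import zipfile


def add_file_to_zip(zip_path, src_path, arcname):
    """Append the file at src_path to the zip archive under arcname.

    Hypothetical helper: it scripts the manual copy of a data file
    (such as scrapy's VERSION file) into the library.zip of a build.
    Returns True if the file was added, False if it was already there.
    """
    # Mode 'a' appends to the archive instead of overwriting it.
    with zipfile.ZipFile(zip_path, 'a') as archive:
        if arcname in archive.namelist():
            return False  # already present, nothing to do
        archive.write(src_path, arcname)
        return True
```

A call might look like `add_file_to_zip(zip_path, r'C:\Python27\Lib\site-packages\scrapy\VERSION', 'scrapy/VERSION')`, where `zip_path` points at the library.zip in the build folder. Member names inside a zip archive use forward slashes, hence `scrapy/VERSION`.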

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
    exec code in m.__dict__
  File "main.py", line 11, in <module>
    crawler = Crawler(settings)
  File "C:\Python27\lib\site-packages\scrapy\crawler.py", line 20, in __init__
    self.stats = load_object(settings['STATS_CLASS'])(self)
  File "C:\Python27\lib\site-packages\scrapy\utils\misc.py", line 42, in load_object
    raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'scrapy.statscol.MemoryStatsCollector': No module named statscol

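I found no fix for this other than importing the module named in the error message in my main.py. In short, that did not work: every newly imported module produced a new message about another module (I tried 15 modules in total), until I hit an error about the aes module from the cryptography library. I also tried alternatives to cx_Freeze, such as py2exe and PyInstaller, with the same result.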
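The second traceback shows Scrapy resolving `settings['STATS_CLASS']` through `load_object`, that is, from a dotted-path string at run time. A likely reason for the chain of missing modules is that a freezer follows explicit `import` statements and cannot see module names that exist only inside strings. The sketch below is a simplified stand-in written for illustration, not Scrapy's actual implementation; `load_object_sketch` is a hypothetical name, and a standard library class stands in for the Scrapy one.

```python
from importlib import import_module


def load_object_sketch(path):
    """Resolve a dotted path such as 'package.module.Name' to an object.

    Simplified illustration of string-based loading; hypothetical helper.
    """
    module_path, _, name = path.rpartition('.')
    if not module_path:
        raise ValueError("Not a full dotted path: %r" % path)
    # The import happens only here, at run time.
    module = import_module(module_path)
    return getattr(module, name)


# The module name below exists only inside a string, so scanning this
# file for import statements never reveals a dependency on 'collections'.
STATS_CLASS = 'collections.OrderedDict'
stats_class = load_object_sketch(STATS_CLASS)
```

If this is indeed the cause, adding imports one at a time only uncovers the next string-referenced module, which matches the behaviour described above.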

Can anyone help me solve this? Thanks for reading this far.

1 Answer

2

Replace your cx_Freeze code with this.

import sys
from cx_Freeze import setup, Executable

# "packages" pulls in each listed package together with all of its submodules.
build_exe_options = {"packages": ["os", "twisted", "scrapy", "test"],
                     "excludes": ["tkinter"],
                     "include_msvcr": True}

base = None  # None selects the default console base
setup(name="MyScript",
      version="0.1",
      description="Demo",
      options={"build_exe": build_exe_options},
      executables=[Executable("C:\\MyScript", base=base)])

The difference is that I include the packages in full, so all of their functionality is available.
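To get a feel for what including a whole package covers, compared with naming single modules, the sketch below enumerates every submodule of a package with the standard library `pkgutil.walk_packages`. The helper name `list_submodules` is hypothetical, and `json` is used only because it ships with Python; this illustrates the idea and is not a description of what cx_Freeze does internally.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted dotted names of a package and all its submodules.

    Hypothetical helper for illustration. Note that walk_packages
    imports subpackages in order to look inside them.
    """
    package = importlib.import_module(package_name)
    # Only packages have __path__; a plain module has no submodules.
    search_path = getattr(package, '__path__', None)
    if search_path is None:
        return [package_name]
    names = [package_name]
    prefix = package.__name__ + '.'
    for _finder, name, _is_pkg in pkgutil.walk_packages(search_path, prefix):
        names.append(name)
    return sorted(names)


submodules = list_submodules('json')
```

Running the same helper on `scrapy` in the development environment would show how many modules a hand-written `includes` list has to cover, which is the work the `packages` option is meant to take over.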
