Running the crawl command with Scrapy inside a project

10 votes
2 answers
17075 views
Asked 2025-04-16 11:46

I'm new to Python and Scrapy, and I'm working through the Scrapy tutorial. I created my project from the DOS command prompt by typing:

scrapy startproject dmoz

Later on, the tutorial refers to the crawl command:

scrapy crawl dmoz.org

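But every time I try to run it, I get a message saying that this is not a valid command. After digging further, I found that I need to be inside a project, and that is the part I can't figure out. I tried changing into the "dmoz" folder that startproject created, but Scrapy does not recognize that folder at all.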

I'm sure I'm missing something obvious, and I'm hoping someone can point it out.

2 Answers

2

Your PATH environment variable is not set.

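You can set the PATH environment variable for Python and Scrapy through System Properties. Right-click "My Computer", choose "Properties", then click "Advanced system settings". In the window that opens, switch to the "Advanced" tab and click the "Environment Variables" button. In the new window, find the variable named "Path" under System variables and append the following entries to it, separated by semicolons.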

C:\{path to python folder}
C:\{path to python folder}\Scripts

For example:

C:\Python27;C:\Python27\Scripts
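
To confirm that the change took effect, you can check whether the commands resolve from PATH. The following is only a minimal sketch using the standard library; the helper names `find_on_path` and `path_entries` are made up for illustration and are not part of Scrapy. Open a new command prompt first, since an already open one keeps the old PATH.

```python
import os
import shutil


def find_on_path(command, path=None):
    """Return the full path of `command` if it can be found on PATH, else None."""
    # shutil.which searches the given PATH string (or the current
    # environment's PATH when path is None) for an executable.
    return shutil.which(command, path=path)


def path_entries(path=None):
    """Split a PATH-style string into its individual directories."""
    raw = os.environ.get("PATH", "") if path is None else path
    # os.pathsep is ";" on Windows and ":" on Unix-like systems.
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    for name in ("python", "scrapy"):
        location = find_on_path(name)
        print(f"{name}: {location or 'not found on PATH'}")
```

If `scrapy` is reported as not found, the Scripts folder is most likely still missing from the Path variable.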

9

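You need to run this command from inside the folder that 'startproject' created. If Scrapy finds your scrapy.cfg file, you will see additional commands. You can see the difference here: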

$ scrapy startproject bar
$ cd bar/
$ ls
bar  scrapy.cfg
$ scrapy
Scrapy 0.12.0.2536 - project: bar

Usage:
  scrapy <command> [options] [args]

Available commands:
  crawl         Start crawling from a spider or URL
  deploy        Deploy project in Scrapyd target
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  queue         Deprecated command. See Scrapyd documentation.
  runserver     Deprecated command. Use 'server' command instead
  runspider     Run a self-contained spider (without creating a project)
  server        Start Scrapyd server for this project
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command


$ cd ..
$ scrapy
Scrapy 0.12.0.2536 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
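
If you are not sure whether your current directory counts as being inside a project, you can look for scrapy.cfg yourself. Below is a rough sketch that walks upward from a starting directory until it finds that file. The function name `find_project_root` is hypothetical, and the upward search is an assumption about how the lookup behaves; Scrapy's own logic may differ in the details.

```python
from pathlib import Path


def find_project_root(start="."):
    """Return the nearest directory at or above `start` that contains
    a scrapy.cfg file, or None if there is no such directory."""
    current = Path(start).resolve()
    # Check the starting directory first, then each parent in turn.
    for directory in (current, *current.parents):
        if (directory / "scrapy.cfg").is_file():
            return directory
    return None


if __name__ == "__main__":
    root = find_project_root()
    if root is None:
        print("no active project: cd into the folder that holds scrapy.cfg")
    else:
        print(f"project root: {root}")
```

Run from the `bar` folder in the transcript above, this should print that folder as the project root; run one level up, it should report no active project, provided no parent folder has its own scrapy.cfg.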
