Trying Scrapy on a simple example, but the JSON output is null

Published 2024-04-16 15:36:30


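I'm learning Scrapy and trying a simple example, but it is not simple for me. I have spent hours on it, on both Linux and Windows. After running "scrapy crawl tiantian" with the output going to items.json (here "tiantian" is the spider's name), there is no output and nothing is written to items.json. Here are the scripts; please help.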

Items:

import scrapy

class MeijuItem(scrapy.Item):
    pass

class tiantianitem(scrapy.Item):
    name = scrapy.Field()
    link = scrapy.Field()

Spider:

import scrapy
from meiju.items import tiantianitem

class TiantianSpider(scrapy.Spider):
    name = "tiantian"
    allowed_domains = ["cn163.net/archives/18296"]
    start_urls = (
        'http://cn163.net/archives/18296/',
    )

    def parse(self, response):
        i=tiantianitem
        items=[]
        sites=response.xpath('/html/body/div[1]/div[4]/div[2]/div[3]/div/p[2]/a')
        for site in sites:
            i['name']=site.xpath('/text()').extract()
            i['link']=site.xpath('/@href').extract()
            items.append(i)
        return items

Settings:

BOT_NAME = 'meiju'

SPIDER_MODULES = ['meiju.spiders']
NEWSPIDER_MODULE = 'meiju.spiders'

ROBOTSTXT_OBEY = True

DOWNLOAD_DELAY = 2

COOKIES_ENABLED = False

DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}
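
Two things in these settings may be worth checking; whether either one matters depends on the installed Scrapy version and on the site. The scrapy.contrib package path has been deprecated since Scrapy 1.0, and the built-in user-agent middleware now lives under scrapy.downloadermiddlewares. A sketch of the same setting with the current path, assuming scrapy_fake_useragent is installed:

```python
DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in user-agent middleware (current module path).
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    # Third-party middleware from the question, unchanged.
    "scrapy_fake_useragent.middleware.RandomUserAgentMiddleware": 400,
}
```

Separately, with ROBOTSTXT_OBEY = True, any request that the site's robots.txt disallows is skipped, which would also leave the output file empty. The crawl log reports such requests as forbidden by robots.txt, so the log is the place to confirm or rule this out.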
