Scrapy: following links in code

Posted 2024-04-25 13:27:08

I have been trying to implement a parse function.

Basically, I worked out in the Scrapy shell that

response.xpath('//*[@id="PagerAfter"]/a[last()]/@href').extract()[0]

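gives me the URL of the next page, so I tried to follow that link to the next page. I looked around Stack Overflow, and it seems everyone uses Rule(LinkExtractor(...)), which I don't think I need here. I'm pretty sure I'm doing this completely wrong. I originally had a for loop that added every link I wanted to visit to start_urls, because I know they all have the form *p1.html, *p2.html, and so on, but I want to make it smarter.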
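For reference, the start_urls loop described above could look roughly like the sketch below. The base URL and the page count are made up for illustration; the question only says that the pages end in p1.html, p2.html, and so on.

```python
# Hypothetical URL pattern; only the "p<N>.html" suffix comes from the question.
BASE = "http://example.com/thread-p{}.html"
LAST_PAGE = 5  # assumed page count, which has to be known in advance

# One start URL per page number, from 1 through LAST_PAGE.
start_urls = [BASE.format(n) for n in range(1, LAST_PAGE + 1)]

print(start_urls)
```

The drawback of this approach is that the number of pages is hard-coded, which is why following the pager link is the "smarter" option.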

def parse(self, response):
    items = []

    for sel in response.xpath('//div[@class="Message"]'):
        itemx = mydata()
        itemx['information'] = sel.extract()
        items.append(itemx)
        with open('log.txt', 'a') as f:
            f.write('\ninformation: ' + itemx.get('information')

    #URL of next page response.xpath('//*[@id="PagerAfter"]/a[last()]/@href').extract()[0]

    next_page = (response.xpath('//*[@id="PagerAfter"]/a[last()]/@href'))

    if (response.url != response.xpath('//*[@id="PagerAfter"]/a[last()]/@href')):
        if next_page:
            yield Request(response.xpath('//*[@id="PagerAfter"]/a[last()]/@href')[0], self.parse)


    return items

But it does not work. I get a

    next_page = (response.xpath('//*[@id="PagerAfter"]/a[last()]/@href'))
            ^
SyntaxError: invalid syntax

error. Also, I know the yield Request part is wrong. I want to call parse recursively and add every snippet from every page to the items list.
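A note on the SyntaxError: the next_page line itself is balanced, while the f.write(...) call a few lines earlier is missing its closing parenthesis, so that is the likely cause. The sketch below reproduces the effect with two made-up snippets (BROKEN and FIXED are illustrative names, not part of the spider). Which line the interpreter reports depends on the Python version: older parsers flag the statement after the unclosed parenthesis, while Python 3.10 and later report that '(' was never closed.

```python
# Two tiny programs as strings; compile() parses them without running them.
BROKEN = (
    "f = open('log.txt', 'a')\n"
    "f.write('information: ' + 'text'\n"   # closing parenthesis missing
    "next_page = 1\n"
)
FIXED = BROKEN.replace("'text'\n", "'text')\n")


def syntax_error_of(source):
    """Return the SyntaxError raised while compiling source, or None."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        return exc
    return None


err = syntax_error_of(BROKEN)
# Depending on the interpreter version, the report may point at the
# next_page line or at the line holding the unclosed parenthesis.
print(err.lineno, err.msg)
print(syntax_error_of(FIXED))
```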

Thank you!

Tags: self id for parse response html page extract

0 answers

No answers yet
