How to index with ItemLoader's add_xpath method

Posted 2024-06-16 12:59:31


I am trying to rewrite this code to use the ItemLoader class:

import scrapy

from ..items import Book


class BasicSpider(scrapy.Spider):
    ...
    def parse(self, response):
        item = Book()

        # note: I only grab the first of the many books on the page
        item['title'] = response.xpath('//*[@class="link linkWithHash detailsLink"]/@title')[0].extract()
        return item

The code above works fine. Now the same thing with ItemLoader:

from scrapy.loader import ItemLoader

class BasicSpider(scrapy.Spider):
    ...    
    def parse(self, response):
        l = ItemLoader(item=Book(), response=response)

        l.add_xpath('title', '//*[@class="link linkWithHash detailsLink"]/@title'[0])  # this does not work - returns an empty dict
        # l.add_xpath('title', '//*[@class="link linkWithHash detailsLink"]/@title')  # this works, of course, but returns every book title on the page, not just the first one, which is what I need
        return l.load_item()

So I only want to grab the title of the first book. How can I do that?


1 answer

User

#1 · Posted 2024-06-16 12:59:31

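One problem in the code is that XPath uses one-based indexing, so the first match is [1], not [0]. The other is that the index brackets need to be inside the string passed to the add_xpath method. In the question's code the [0] sits outside the quotes, so Python indexes the string itself and only its first character, /, is passed as the XPath expression.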

So the correct code looks like this:

l.add_xpath('title', '(//*[@class="link linkWithHash detailsLink"]/@title)[1]')
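
A small sketch of why the placement of the index matters, using only plain Python string handling. The variable names xpath, wrong, and right are illustrative and not from the original post:

```python
# The XPath expression from the question, as a plain Python string.
xpath = '//*[@class="link linkWithHash detailsLink"]/@title'

# Indexing outside the quotes slices the Python string itself,
# so only its first character would be handed to add_xpath.
wrong = xpath[0]

# Putting a one-based index inside the string keeps it part of the XPath.
# The parentheses apply [1] to the whole result set, not to each parent node.
right = '(' + xpath + ')[1]'

print(repr(wrong))
print(right)
```

The value in right is the same expression used in the add_xpath call above, which is why that call selects only the first title.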
