XPath fails to extract the required elements in a spider

def parse(self, response):
    item = AmazonItem()
    item['url'] = response.url
    item['SellerName'] = response.xpath(".//*[@id='bylineInfo']/text()").extract()[0].strip()
    item['itemtitle'] = response.xpath(".//*[@id='productTitle']/text()").extract()[0].strip()
    item['rating'] = response.xpath(".//*[@class='a-icon-alt']/text()").extract()[0].strip()
    item['price'] = response.xpath(".//*[@class='a-size-medium a-color-price']/text()").extract()[0].strip()
    try:
        list = response.xpath(".//*[@class='a-unordered-list a-vertical a-spacing-none']/li/span[@class='a-list-item']/text()").extract()
        item['desc'] = [s.strip() for s in list]
    except IndexError:
        item['desc'] = "No Description"

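3 Answers

User

#1 · Edited 2024-04-25 23:50:12

Make sure to avoid compound classes. I've tried to show how the selectors should be defined. All you need to do is replace the XPaths in your Scrapy project with the ones used below.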

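import requests
from scrapy import Selector

url = "https://www.amazon.com/dp/B01NCX988Q/?tag=stackoverflow17-20"

# Send a browser-like User-Agent header with the request.
res = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
# res is a requests response, not a Scrapy one, so build the selector from its HTML text.
sel = Selector(text=res.text)
product_url = res.url
# Select by id or by a single class; default="" keeps .strip() from failing when nothing matches.
seller = sel.xpath("//a[@id='bylineInfo']/text()").extract_first(default="").strip()
title = sel.xpath("//*[@id='productTitle']/text()").extract_first(default="").strip()
rating = sel.xpath("//span[@class='a-icon-alt']/text()").extract_first(default="").strip()
price = sel.xpath("//*[@id='priceblock_ourprice']/text()").extract_first(default="").strip()
# Collapse runs of whitespace inside each feature bullet.
desc = [' '.join(item.split()) for item in sel.xpath("//*[@id='feature-bullets']//*[@class='a-list-item']/text()").extract()]
print(f'{product_url}\n{seller}\n{title}\n{rating}\n{price}\n{desc}')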
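
To see why an exact match on a compound class is brittle, here is a minimal sketch on a made-up snippet. The markup and the reordered class value are assumptions for illustration, not Amazon's real HTML. It uses the standard library's xml.etree.ElementTree instead of Scrapy so that it runs without extra installs. ElementTree supports only a subset of XPath and needs the leading `.//`, but its `[@class='...']` predicate is the same exact string comparison a full XPath engine performs.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the same two classes as in the question, in a different order.
html = """
<div>
  <span id="priceblock_ourprice" class="a-color-price a-size-medium">$19.99</span>
</div>
"""
root = ET.fromstring(html)

# Exact string match on the compound class: finds nothing because the order differs.
by_class = root.findall(".//span[@class='a-size-medium a-color-price']")

# Match on the id instead: unaffected by class order or extra classes.
by_id = root.findall(".//*[@id='priceblock_ourprice']")

print(len(by_class))  # 0
print(by_id[0].text)  # $19.99
```

An id or a single stable class tends to survive small markup changes that break a full class-string comparison.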

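User

#2 · Edited 2024-04-25 23:50:12

Don't start your XPath expressions with ".". A leading "." makes the expression relative to the current node, which is only needed when querying from a nested selector.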

from operator import methodcaller

if response.css('span.a-list-item::text'):
    # Strip each bullet and drop the empty ones; list() materializes the lazy filter object on Python 3.
    item['description'] = list(filter(bool, map(methodcaller('strip'), response.css('span.a-list-item::text').extract())))
else:
    item['description'] = 'No Description'
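
One thing to watch on Python 3: `filter` returns a lazy iterator, not a list. The sketch below runs the same strip-and-filter pipeline on made-up strings (the sample text is an assumption, not scraped data) and shows that the bare filter object is truthy even when nothing would survive the filtering.

```python
from operator import methodcaller

# Made-up bullet texts standing in for the extracted span contents.
raw = ["\n  Stainless steel  \n", "   ", "Dishwasher safe\t"]

# Strip each string, drop the empty ones, and materialize the result as a list.
cleaned = list(filter(bool, map(methodcaller("strip"), raw)))
print(cleaned)  # ['Stainless steel', 'Dishwasher safe']

# A filter object is always truthy, so it cannot be used to test for "no bullets".
empty_lazy = filter(bool, map(methodcaller("strip"), ["  ", "\n"]))
print(bool(empty_lazy))        # True
print(bool(list(empty_lazy)))  # False
```

Wrapping the pipeline in `list()` gives a value that can be stored on the item, serialized, and tested for emptiness.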

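User

#3 · Edited 2024-04-25 23:50:12

The item you found without a description happens to be out of stock, which is a completely different situation from an item that has no description. When an item is out of stock, these attributes never appear on the page. :) So check the product's availability first, and only then check its attributes.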
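
A rough sketch of that order of checks, in plain Python so it can be tried without a live page. The function name `build_item`, the `in_stock` key, and the phrases "unavailable" and "out of stock" are all hypothetical. Adapt them to whatever availability text your spider actually extracts.

```python
def build_item(availability, fields):
    """Check availability first, then clean the other attributes.

    availability: text scraped from the page's availability block (may be None).
    fields: dict of raw extracted strings, where a missing node is None.
    """
    text = (availability or "").strip().lower()
    # Assumed marker phrases; an out-of-stock page is reported as such
    # instead of being mistaken for a product with missing attributes.
    if "unavailable" in text or "out of stock" in text:
        return {"in_stock": False}

    # Only now look at the attributes, tolerating ones that are absent.
    item = {"in_stock": True}
    for name, value in fields.items():
        item[name] = value.strip() if value else None
    return item


print(build_item("Currently unavailable.", {"price": None}))
print(build_item("In Stock.", {"price": " $19.99 ", "rating": None}))
```

In a spider, the two arguments would come from `extract_first()` calls on the response, so a missing node arrives as `None` rather than raising an `IndexError`.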
