Python scraper returns the same response for every different URL request

Posted 2024-04-25 18:16:39


I am building a very simple scraper, but I am making some silly mistake somewhere that I cannot find.

In the parse callback, which loops over all the products on a product-listing page, I end up getting the same response data for every URL I request.

I am adding the code below; please help.

def parse(self, response): 
    item = {}
    count = 0
    for single in response.xpath('//div[@class="_3O0U0u"]/div'):
        count+=1
        # print(count)
        item['data_id'] = single.xpath('.//@data-id').extract_first()
        item['price'] = single.xpath('.//div[@class="_1vC4OE"]/text()').extract_first()
        item['url'] = single.xpath('.//div[@class="_1UoZlX"]/a[@class="_31qSD5"]/@href').extract_first()
        if not item['url']:
            item['url'] = single.xpath('.//div[@class="_3liAhj _1R0K0g"]/a[@class="Zhf2z-"]/@href').extract_first()
        #print(item)
        if item['url']:
            yield scrapy.Request('https://www.somewebsite.com' + item['url'], callback = self.get_product_detail, priority = 1, meta={'item': item})
            # break

    next_page = response.xpath('//div[@class="_2zg3yZ"]/nav/a[@class="_3fVaIS"]/span[contains(text(),"Next")]/parent::a/@href').extract_first()
    if next_page:
        next_page = 'https://www.somewebsite.com' + next_page
        yield scrapy.Request(next_page, callback=self.parse, priority=1)

def get_product_detail(self, response):
    dict_item = response.meta['item']
    sku = dict_item['data_id']
    print('dict SKU ======== ', sku)
