How do I change my location when scraping a website?

class CarrefoursaSpider(scrapy.Spider):
    name = 'carrefoursa'
    allowed_domains = ['www.carrefoursa.com']
    start_urls = ['https://www.carrefoursa.com/meyve/c/1015']
    custom_settings = {
        "LOG_FILE": "scrapy_logs/" + name + ".log",
        "ROBOTSTXT_OBEY": False,
        "USER_AGENTS": None,
        "COOKIES_ENABLED": True,
        "COOKIES_DEBUG": True,
    }

    def parse(self, response):
        request = scrapy.Request(
            response.url,
            callback=self.parse_product,
            cookies={'Content-Language': 'tr', 'currency': 'TRY', 'country': 'TR', 'lang': 'tr'},
            dont_filter=True,
        )
        yield request

    def parse_product(self, response):
        ...

1 answer

User

#1 · Posted on 2024-04-25 08:56:59

I added a proxy to my spider through the request meta, and that solved my problem.

import scrapy


class CarrefoursaSpider(scrapy.Spider):
    name = 'carrefoursa'
    allowed_domains = ['www.carrefoursa.com']
    start_urls = ['https://www.carrefoursa.com/meyve/c/1015']
    custom_settings = {
        "LOG_FILE": "scrapy_logs/" + name + ".log",
        "ROBOTSTXT_OBEY": False,
        "USER_AGENTS": None,
        "COOKIES_ENABLED": True,
        "COOKIES_DEBUG": True,
    }

    def parse(self, response):
        # Re-request the page with the locale cookies, routed through the proxy.
        # Scrapy reads the proxy from the request's meta, not from a class attribute.
        yield scrapy.Request(
            response.url,
            callback=self.parse_product,
            cookies={'Content-Language': 'tr', 'currency': 'TRY', 'country': 'TR', 'lang': 'tr'},
            meta={'proxy': 'xxx.xxx.xxx.xx:xxxx'},  # placeholder from the original answer
            dont_filter=True,
        )

    def parse_product(self, response):
        ...
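
For what it's worth, Scrapy's built-in HttpProxyMiddleware picks up the proxy from each request's `meta['proxy']`, so the proxy and the locale cookies have to travel with every request that should appear to come from the other location. Below is a minimal sketch of one way to assemble those arguments in a single place. The helper name `build_request_kwargs` and the constant `LOCALE_COOKIES` are hypothetical and not part of Scrapy; the cookie values are the ones from the question.

```python
from typing import Any, Dict, Optional

# Locale cookies copied from the question.
LOCALE_COOKIES = {
    "Content-Language": "tr",
    "currency": "TRY",
    "country": "TR",
    "lang": "tr",
}


def build_request_kwargs(
    url: str,
    proxy: Optional[str] = None,
    cookies: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for scrapy.Request."""
    kwargs: Dict[str, Any] = {"url": url, "dont_filter": True}
    if cookies:
        # Copy so the shared template is never mutated by a caller.
        kwargs["cookies"] = dict(cookies)
    if proxy:
        # HttpProxyMiddleware looks for the proxy under request.meta["proxy"].
        kwargs["meta"] = {"proxy": proxy}
    return kwargs
```

Inside the spider this could be used as `yield scrapy.Request(callback=self.parse_product, **build_request_kwargs(url, proxy=..., cookies=LOCALE_COOKIES))`. Doing this in `start_requests` rather than in `parse` would also send the very first request through the proxy, which the answer's version does not do.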
