I am writing a spider for this start URL, https://usa.ingrammicro.com/_layouts/CommerceServer/IM/search2.aspx#PNavDS=N:0&t=pTab, and I am currently using the following code:
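import scrapy
from scrapy_splash import SplashRequest


class IngrammicroSpiderSpider(scrapy.Spider):
    name = 'ingrammicro_spider'
    allowed_domains = ['usa.ingrammicro.com']
    # One URL per results page; Nao is the item offset, advanced 10 at a time.
    start_urls = [f'https://usa.ingrammicro.com/_layouts/CommerceServer/IM/search2.aspx#PNavDS=N:0,Nao:{str(x)}&t=pTab' for x in range(0, 912990 + 1, 10)]

    def start_requests(self):
        for url in self.start_urls:
            # Render each page through Splash, waiting 10 s for the JavaScript to load the results.
            yield SplashRequest(url, self.parse, args={'wait': 10.0})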
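I want to walk through every page of the paginator with "per page" set to 100 items. At the moment only 10 items load per page. I looked through the headers and cookies of the XHR requests in the Network tab, but I could not find any setting related to this. How can I do it? I want 9,000 pages of 100 items each, not 90,000 pages of 10 items each. I do not mean changing the URLs like this: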
start_urls = [f'https://usa.ingrammicro.com/_layouts/CommerceServer/IM/search2.aspx#PNavDS=N:0,Nao:{str(x)}&t=pTab' for x in range(0, 900001, 100)]
because each page would still show only 10 items, i.e. items 0-10, then 100-110, then 200-210, and so on.
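To make the problem concrete, here is a small sketch of the arithmetic. It assumes about 900,000 items in total (implied by "90,000 pages of 10 items") and that `Nao` is the offset of the first item on a page. The helper names `page_url` and `covered_items` are made up for illustration only.

```python
BASE_URL = "https://usa.ingrammicro.com/_layouts/CommerceServer/IM/search2.aspx"


def page_url(offset):
    """Build a listing URL whose Nao fragment parameter is the item offset."""
    return f"{BASE_URL}#PNavDS=N:0,Nao:{offset}&t=pTab"


def covered_items(total, step, page_size):
    """Count the distinct items seen when requesting offsets 0, step, 2*step, ...

    Each request returns at most page_size items. When page_size exceeds
    step the pages overlap, so only step new items are gained per request.
    """
    per_request = min(step, page_size)
    return sum(min(per_request, total - offset) for offset in range(0, total, step))


# Assumed total, taken from the page counts mentioned in the question.
TOTAL_ITEMS = 900_000

current = covered_items(TOTAL_ITEMS, step=10, page_size=10)     # what the spider does now
skipping = covered_items(TOTAL_ITEMS, step=100, page_size=10)   # only the URL step changed
wanted = covered_items(TOTAL_ITEMS, step=100, page_size=100)    # what I am after
```

With these assumptions, `current` and `wanted` both cover all 900,000 items, but `skipping` covers only 90,000 of them. So changing the step in the URL is not enough; the page size itself has to become 100.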