Python requests module: showing more results

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.game.co.uk/en/m/games/best-selling-games/best-selling-xbox-one-games/?merchname=MobileTopNav-_-XboxOne_Games-_-BestSellers")
c = r.content
soup = BeautifulSoup(c, "html.parser")
products = soup.find_all("div", {"class": "product"})

for item in products:
    # Title: text of the first <h2> in the product block
    print(item.find("h2").text.replace('\\h2', '').replace(" ", ""))
    # First condition/price pair
    print(item.find("span", {"class": "condition"}).text + " " + item.find("span", {"class": "value"}).text)
    try:
        # Second condition/price pair, if the product has one
        print(item.find_all("span", {"class": "condition"})[1].text + " " + item.find_all("span", {"class": "value"})[1].text)
    except IndexError:
        print("No Preowned")
    print(" ")

2 Answers

User

#1 · Edited 2024-04-18 19:44:59

Try this code to get all the items available on that page. You can use Chrome DevTools to find the URL used below, which takes an incrementing page number.

import requests
from bs4 import BeautifulSoup

# Paginated listing URL; the page number is filled in for each request.
page_link = "https://www.game.co.uk/en/m/games/best-selling-games/best-selling-xbox-one-games/?merchname=MobileTopNav-_-XboxOne_Games-_-BestSellers&pageNumber={}&pageMode=true"

page_no = 0

while True:
    page_no += 1
    res = requests.get(page_link.format(page_no))
    soup = BeautifulSoup(res.text, 'lxml')  # needs the lxml package installed
    container = soup.select(".productInfo h2")
    # A page with at most one title is treated as the end of the results.
    if len(container) <= 1:
        break

    for content in container:
        print(content.text)

Output for the last few titles:

ARK Survival Evolved
Kingdom Come Deliverance Special Edition
Halo 5 Guardians
Sonic Forces
The Elder Scrolls Online: Summerset - Digital
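
The same pagination idea can be wrapped in a function that returns the titles instead of printing them. This is only a sketch under a few assumptions: `fetch_page` and `collect_titles` are names made up for illustration, the `max_pages` cap is an added safety limit, and the loop stops at the first page with no titles at all, which is a guess about how the site signals the end (the code above stops at one title or fewer). It uses `html.parser` so that lxml is not required.

```python
import requests
from bs4 import BeautifulSoup

PAGE_LINK = (
    "https://www.game.co.uk/en/m/games/best-selling-games/best-selling-xbox-one-games/"
    "?merchname=MobileTopNav-_-XboxOne_Games-_-BestSellers&pageNumber={}&pageMode=true"
)


def fetch_page(page_no):
    """Download one page of results and return its HTML (hypothetical helper)."""
    res = requests.get(PAGE_LINK.format(page_no), timeout=10)
    res.raise_for_status()
    return res.text


def collect_titles(fetch=fetch_page, max_pages=50):
    """Gather titles page by page until a page comes back without any."""
    titles = []
    for page_no in range(1, max_pages + 1):
        soup = BeautifulSoup(fetch(page_no), "html.parser")
        found = [h2.get_text(strip=True) for h2 in soup.select(".productInfo h2")]
        if not found:
            # Assumption: an empty page means we are past the last page.
            break
        titles.extend(found)
    return titles


if __name__ == "__main__":
    for title in collect_titles():
        print(title)
```

Passing the fetch function as a parameter makes it possible to try the parsing logic against saved HTML without hitting the site.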

User

#2 · Edited 2024-04-18 19:44:59

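You need a web crawler that can execute JavaScript/jQuery, such as Selenium. Selenium drives a real browser; it does not use BeautifulSoup internally, but you can hand the rendered page over to BeautifulSoup for parsing. The problem you are facing is that the content you are trying to reach is created dynamically by JavaScript when the button you mentioned is clicked. When you simply request the page, those additional HTML elements have not been created yet, so BeautifulSoup cannot find them. With Selenium you can click buttons, fill in forms and so on, and you can also wait for the content you want to be created.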

The Selenium documentation should be self-explanatory.
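
As a rough illustration of that approach, the sketch below opens the page in Chrome, clicks a "load more" button a few times, waits for new titles to appear, and then parses the rendered HTML with BeautifulSoup. Several things here are assumptions rather than facts about the site: `LOAD_MORE_SELECTOR` is a hypothetical CSS selector that has to be replaced after inspecting the page, the `.productInfo h2` selector is borrowed from the other answer, and the function names are invented for this example. It also assumes Selenium and a Chrome driver are available.

```python
from bs4 import BeautifulSoup

URL = (
    "https://www.game.co.uk/en/m/games/best-selling-games/best-selling-xbox-one-games/"
    "?merchname=MobileTopNav-_-XboxOne_Games-_-BestSellers"
)
LOAD_MORE_SELECTOR = "a.loadMore"  # hypothetical: inspect the page for the real one
TITLE_SELECTOR = ".productInfo h2"


def extract_titles(html):
    """Parse product titles out of already rendered HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [h2.get_text(strip=True) for h2 in soup.select(TITLE_SELECTOR)]


def fetch_rendered_html(url, max_clicks=5, timeout=10):
    """Load the page in Chrome, click 'load more' up to max_clicks times."""
    # Imported here so extract_titles can be used without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        for _ in range(max_clicks):
            before = len(driver.find_elements(By.CSS_SELECTOR, TITLE_SELECTOR))
            try:
                button = wait.until(
                    EC.element_to_be_clickable((By.CSS_SELECTOR, LOAD_MORE_SELECTOR))
                )
                button.click()
                # Wait until the click has actually added more titles.
                wait.until(
                    lambda d: len(d.find_elements(By.CSS_SELECTOR, TITLE_SELECTOR)) > before
                )
            except TimeoutException:
                break  # no button, or nothing new loaded: stop clicking
        return driver.page_source
    finally:
        driver.quit()


if __name__ == "__main__":
    for title in extract_titles(fetch_rendered_html(URL)):
        print(title)
```

Keeping the parsing step separate from the browser step means the selectors can be checked against a saved copy of the page before running a full browser session.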
