Getting to the next page with Selenium

Posted 2024-04-19 15:40:12


When I navigate to the link below and click through the pagination at the bottom of the page: https://shop.nordstrom.com/c/sale-mens-clothing?origin=topnav&breadcrumb=Home%2FSale%2FMen%2FClothing&sort=Boosted

I can only scrape the first 4 pages or so, and then my script stops.

I have tried the XPath, CSS selector, and WebDriverWait options.

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException

pages_remaining = True
page = 2   # starts at page 2 since page 1 is already scraped by the first loop

while pages_remaining:

    # scrape code

    try:
        # wait until the numbered pagination link is clickable, then click it
        wait = WebDriverWait(browser, 20)
        wait.until(EC.element_to_be_clickable((By.LINK_TEXT, str(page)))).click()

        print(browser.current_url)
        page += 1

    except TimeoutException:
        pages_remaining = False

Current console output:

 https://shop.nordstrom.com/c/sale-mens-designer-clothing-accessories-shoes?breadcrumb=Home%2FSale%2FMen%2FDesigner&page=2&sort=Boosted

 https://shop.nordstrom.com/c/sale-mens-designer-clothing-accessories-shoes?breadcrumb=Home%2FSale%2FMen%2FDesigner&page=3&sort=Boosted

 https://shop.nordstrom.com/c/sale-mens-designer-clothing-accessories-shoes?breadcrumb=Home%2FSale%2FMen%2FDesigner&page=4&sort=Boosted

2 Answers

This is a fairly simple solution, since I am not too familiar with Selenium.

Try creating a new variable with the page numbers. As you can see, the URL changes when you go to the next page, so you just need to manipulate the given URL. See the code example below.

# Assuming get() below is requests.get
from requests import get

# Define variable pages first
pages = [str(i) for i in range(1, 53)]  # 53 'cuz you have 52 pages

for page in pages:
    response = get("https://shop.nordstrom.com/c/sale-mens-clothing?origin=topnav&breadcrumb=Home%2FSale%2FMen%2FClothing&page=" + page + "&sort=Boosted")
    # Rest of your code

This snippet should do the job for the remaining pages. Hope this helps, even though it may not be exactly what you were looking for.

If you have any questions, post them below. ;)

Cheers.
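As a side note on the snippet in the answer above: the same paged URLs can also be built with urllib.parse.urlencode instead of string concatenation, which keeps the query parameters readable. This is only a stylistic sketch of the same idea, not part of the original answer:

from urllib.parse import urlencode

base = "https://shop.nordstrom.com/c/sale-mens-clothing"
params = {
    "origin": "topnav",
    "breadcrumb": "Home/Sale/Men/Clothing",  # urlencode percent-encodes the slashes
    "sort": "Boosted",
}

for page in range(1, 53):  # 52 pages, as in the snippet above
    # produces e.g. ...&breadcrumb=Home%2FSale%2FMen%2FClothing&sort=Boosted&page=1
    url = base + "?" + urlencode({**params, "page": page})
    print(url)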

You can loop through the page numbers by changing the URL until no more results are shown:

from bs4 import BeautifulSoup
from selenium import webdriver

base_url = "https://m.shop.nordstrom.com/c/sale-mens-clothing?origin=topnav&breadcrumb=Home%2FSale%2FMen%2FClothing&page={}&sort=Boosted"

driver = webdriver.Chrome()

page = 1
soup = BeautifulSoup("", "html.parser")

# Will loop until there are no more results
while "Looks like we don’t have exactly what you’re looking for." not in soup.text:
    print(base_url.format(page))
    # Go to page
    driver.get(base_url.format(page))
    soup = BeautifulSoup(driver.page_source, "html.parser")

    ### your extracting code

    page += 1
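To flesh out the "### your extracting code" placeholder above, the extraction step could look roughly like the sketch below. The article/h3 selectors are hypothetical placeholders (the real product markup has to be inspected on the page), and the explicit wait is there only because a JavaScript-rendered page may not have the products in page_source immediately after driver.get:

from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def scrape_page(driver):
    # Wait for at least one product card to render before parsing the DOM.
    # "article" is a made-up selector; inspect the real page markup.
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "article"))
    )
    soup = BeautifulSoup(driver.page_source, "html.parser")
    # Placeholder extraction: collect the heading text of each product card.
    items = [h.get_text(strip=True) for h in soup.select("article h3")]
    return soup, items

Calling soup, items = scrape_page(driver) inside the loop, in place of the soup = BeautifulSoup(...) line, keeps the soup-based stop condition working while also returning the scraped items.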
