Getting the next page of results with BeautifulSoup in Python

Published 2024-04-24 23:57:25

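I am trying to scrape Airbnb listings, and my code gives inconsistent output when I run it. About three times out of four I get an error, and about one time in four I get the result I want. As a result I cannot reliably get the HTML I need, and the "while True" loop also fails because of these random errors. What can I do to get a proper soup instead of "None" in this case?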

# copy of code part 1 - get the first page
import requests
from bs4 import BeautifulSoup

url = 'https://www.airbnb.com/s/Honolulu--HI--United-States/homes?tab_id=home_tab&refinement_paths%5B%5D=%2Fhomes&flexible_trip_dates%5B%5D=november&flexible_trip_dates%5B%5D=october&flexible_trip_lengths%5B%5D=weekend_trip&date_picker_type=calendar&checkin=2021-10-28&checkout=2021-10-31&source=structured_search_input_header&search_type=autocomplete_click&place_id=ChIJTUbDjDsYAHwRbJen81_1KEs&federated_search_session_id=a49caa0f-ce1b-4ac0-a7af-ffbc9f5159f5&pagination_search=true&items_offset=20&section_offset=2'
base = 'https://www.airbnb.com'

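def get_url(url):
    """Download the page at url and return it parsed as a BeautifulSoup object."""
    page = requests.get(url)
    print(page)  # shows the response object, e.g. <Response [200]>
    soup = BeautifulSoup(page.text, 'lxml')
    return soup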

soup = get_url(url)
#soup

# copy of code part 2 - get all pages
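count = 1
while True:
    # find the "Next" button; find() returns None when the link is missing
    next_page = soup.find('a', {'aria-label': 'Next'})
    if next_page is None or not next_page.get('href'):
        # either the last page was reached or the response lacked the link
        print('No "Next" link found, stopping after page', count)
        break

    new_url = base + next_page.get('href')
    print(new_url)

    url = new_url
    page = requests.get(url)
    soup = BeautifulSoup(page.text, 'lxml')
    print(count)
    count = count + 1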
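
Since the question describes a response that only sometimes contains the expected "Next" link, one possible way to cope is to retry the download a few times before giving up. The following is a minimal sketch of that idea, not a confirmed fix for this site. The names fetch_until_valid, fetch, is_valid, attempts and delay are hypothetical and do not come from the question's code; the helper is deliberately independent of requests and BeautifulSoup so that it can wrap any download function.

```python
import time


def fetch_until_valid(fetch, is_valid, attempts=5, delay=1.0):
    """Call fetch() until is_valid(result) is true or attempts run out.

    fetch and is_valid are caller-supplied callables (hypothetical names).
    Returns the first valid result, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = fetch()
        if is_valid(result):
            return result
        if attempt < attempts:
            # wait a little longer after each failed attempt
            time.sleep(delay * attempt)
    return None
```

With the code from the question, this could be called as fetch_until_valid(lambda: get_url(url), lambda s: s.find('a', {'aria-label': 'Next'}) is not None). A None return value would then mean either that the last page was reached or that every attempt came back without the link, so the loop should stop in both cases.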
