Selenium: wrong selector results in no output

Posted 2024-05-13 23:17:24


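I am trying to scrape this website, Best Western Mornington Hotel, for the names of the hotel rooms and the price of each room. I am using Selenium to scrape this data, but I keep getting nothing back, which I assume is because I am using the wrong selectors/XPath. Is there a way to identify the correct XPath, div class, or selector? I feel like I have selected the right ones, but there is no output.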

from re import sub
from decimal import Decimal
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time

seleniumurl = 'https://www.bestwestern.co.uk/hotels/best-western-mornington-hotel-london-hyde-park-83187/in-2021-06-03/out-2021-06-05/adults-1/children-0/rooms-1'



driver = webdriver.Chrome(executable_path='C:\\Users\\Conor\\Desktop\\diss\\chromedriver.exe')
driver.get(seleniumurl)
time.sleep(5)
working = driver.find_elements_by_class_name('room-type-block')

for work in working:
    name = work.find_elements_by_xpath('.//div/h4').string
    price = work.find_elements_by_xpath('.//div[2]/div[2]/div/div[1]/div/div[3]/div/div[1]/div/div[2]/div[1]/div[2]/div[1]/div[1]/span[2]').string
    print(name,price)

3 answers

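I have only used Selenium from Java, but from what I can see, you are fetching a collection of WebElements and calling toString() on the collection.

It should be find_element_by_xpath, which returns a single WebElement, and then you should read .text rather than .string.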
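The point above can be illustrated without a browser. The following is a minimal sketch, assuming a hypothetical FakeElement class that stands in for a Selenium WebElement (only its .text attribute is modelled) and a hypothetical helper collect_text. It shows that a list has no .string or .text of its own, and that an empty match list makes a for-loop run zero times, which is one plausible reason for seeing no output and no error.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modelled."""

    def __init__(self, text):
        self.text = text


def collect_text(elements):
    """Return the .text of every element, or raise if nothing matched."""
    if not elements:
        # An empty result means the locator matched nothing, so a for-loop
        # over it would run zero times and print nothing at all.
        raise LookupError("locator matched no elements")
    return [element.text for element in elements]


# Made-up room names, standing in for what find_elements_* might return.
matched = [FakeElement("Standard Double"), FakeElement("Deluxe Twin")]

# A list has no .string (or .text) attribute; only its items have .text.
try:
    matched.string
except AttributeError as exc:
    list_error = str(exc)

names = collect_text(matched)
print(names)

# An empty match list is reported loudly instead of silently printing nothing.
try:
    collect_text([])
except LookupError as exc:
    empty_error = str(exc)
print(empty_error)
```

Checking whether the outer list is empty before looping is a cheap way to tell a wrong locator apart from a wrong attribute name.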

Use the CSS selector div#rr_wrp div.room-type-block together with the visibility_of_all_elements_located expected condition to get the list of category divs.

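With the selector above, you can then search inside each block with relative XPaths: .//h2[@class="room-type title"] for the title, .//strong[@class="trimmedTitle rt-item title"] for the sub-category, and .//div[@class="rt-rate-right row group"]//span[@data-bind="text: priceText"] for the price.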
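One way to sanity-check relative XPaths like these before pointing them at the live page is to run them against a small saved fragment. The sketch below does this with Python's standard xml.etree.ElementTree rather than Selenium. The SAMPLE markup and its room, rate, and price values are invented for illustration and are not the real page source. ElementTree supports only a subset of XPath (exact attribute matches work, contains() does not) and needs well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; the real page is far larger and may differ.
SAMPLE = """
<div id="rr_wrp">
  <div class="room-type-block">
    <h2 class="room-type title">Double Room</h2>
    <strong class="trimmedTitle rt-item title">Flexible Rate</strong>
    <div class="rt-rate-right row group">
      <div><span data-bind="text: priceText">120.00</span></div>
    </div>
  </div>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

rooms = []
for block in root.findall('.//div[@class="room-type-block"]'):
    # Each path starts with "." so it is evaluated relative to this block.
    title = block.find('.//h2[@class="room-type title"]')
    rate = block.find('.//strong[@class="trimmedTitle rt-item title"]')
    price = block.find(
        './/div[@class="rt-rate-right row group"]'
        '//span[@data-bind="text: priceText"]'
    )
    rooms.append((title.text, rate.text, price.text))

# A path that matches nothing returns None, so a wrong locator is easy to spot.
missing = root.find('.//div/h4')

print(rooms)
print(missing)
```

In the browser itself, the same paths can be tried in the developer tools console before they go into the script.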

Try the following code, which uses a zip loop to walk the two parallel lists:

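from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Selenium 3 API, as in the question; Selenium 4 replaces
# find_elements_by_xpath(x) with find_elements(By.XPATH, x).
driver = webdriver.Chrome(executable_path='C:\\Users\\Conor\\Desktop\\diss\\chromedriver.exe')
driver.get('https://www.bestwestern.co.uk/hotels/best-western-mornington-hotel-london-hyde-park-83187/in-2021-06-03/out-2021-06-05/adults-1/children-0/rooms-1')

# Wait up to 20 seconds for the room blocks to become visible.
wait = WebDriverWait(driver, 20)
elements = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div#rr_wrp div.room-type-block')))

for element in elements:
    # The leading "." keeps each XPath relative to the current room block.
    for room_title in element.find_elements_by_xpath('.//h2[@class="room-type title"]'):
        print("Main Title ==>> " + room_title.text)
        rate_titles = element.find_elements_by_xpath('.//strong[@class="trimmedTitle rt-item title"]')
        rate_prices = element.find_elements_by_xpath('.//div[@class="rt-rate-right row group"]//span[@data-bind="text: priceText"]')
        # Pair each rate title with the price at the same position.
        for room_type, room_price in zip(rate_titles, rate_prices):
            print(room_type.text + " " + room_price.text)

driver.quit()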
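One caveat with pairing two separately located lists is that zip stops at the shorter one, so a rate without a price is dropped without warning. The sketch below uses plain strings with made-up rate names and prices (not values from the site) and a hypothetical helper pair_rates to show the difference between zip, itertools.zip_longest, and an explicit length check.

```python
from itertools import zip_longest


def pair_rates(titles, prices):
    """Pair titles with prices, refusing lists of unequal length."""
    if len(titles) != len(prices):
        raise ValueError(f"{len(titles)} titles but {len(prices)} prices")
    return list(zip(titles, prices))


# Made-up values: three rate titles but only two prices were located.
titles = ["Flexible Rate", "Advance Purchase", "Bed and Breakfast"]
prices = ["120.00", "105.00"]

# zip silently drops the unmatched third title.
silent = list(zip(titles, prices))

# zip_longest keeps it and marks the gap with a placeholder.
padded = list(zip_longest(titles, prices, fillvalue="n/a"))

# The strict helper turns the mismatch into an explicit error.
try:
    pair_rates(titles, prices)
except ValueError as exc:
    mismatch = str(exc)

print(silent)
print(padded)
print(mismatch)
```

If the two lists can differ in length on the real page, locating the price relative to each rate row is likely to be more robust than zipping two independent lists.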

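Marek is right that you should use .text instead of .string. Alternatively, use .get_attribute("innerHTML"). I also think your XPath may be wrong, unless I am looking at the wrong page. Here are some XPaths from the page you linked: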

# Get all the room type sections.
roomTypes = driver.find_elements_by_xpath("//div[contains(@class,'room-type-box__content')]")

# Get the room type titles. find_elements_by_xpath returns a list,
# so search from the driver rather than from roomTypes.
roomTitles = driver.find_elements_by_xpath("//div[contains(@class,'room-type-title')]/h3")

# Print the room type titles.
for r in roomTitles:
    print(r.text)
