How do I extract a list of URLs from a page using Selenium?

Posted 2024-04-26 12:12:18


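I am trying to extract all the URLs that https://shop.freedommobile.ca/devices exposes when you click the "See Options" button under each phone, and put them into a list of strings.

I am using Python with Selenium and its wait library. I have already tried using .text in the arguments. However, I keep running into an error that says:

TypeError: 'str' object is not callable. Line 17 is where the problem is.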

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By


driver = webdriver.Chrome()

class phoneCost:


    driver.get("https://shop.freedommobile.ca/devices")

    # extract the names of the phones
    wait = WebDriverWait(driver, 20) #10 second wait
    XPathLocation = """B//*[@id="skip-navigation"]/div/div/div[1]/div/div[2]/a'"""
    phonePlanLinksRaw = wait.until(EC.presence_of_all_elements_located(By.XPATH(XPathLocation)))
    phonePlanLinks = []


    for element in range(len(phonePlanLinksRaw)):
        link = element
        phonePlanLinks.append(str(link))


    numLink = 1
    for element in range(len(phonePlanLinks)):
        print("phone " + str(numLink) + " : " + phonePlanLinks[element])
        numLink += 1

It should return the list of URLs as strings:

[https://shop.freedommobile.ca/devices/Apple/iPhone_XS_Max?sku=190198786135&planSku=Freedom%20Big%20Gig%2015GB

https://shop.freedommobile.ca/devices/Apple/iPhone_XS?sku=190198790569&planSku=Freedom%20Big%20Gig%

https://shop.freedommobile.ca/devices/Apple/iPhone_XR?sku=190198776631&planSku=Freedom%20Big%20Gig%2015GB]

Thanks for your help.


3 answers

This is the logic you should use.

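# Wait until the device links are present in the DOM
WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[starts-with(@class,'deviceListItem')]/a")))
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which newer Selenium releases no longer provide
mblOptions = driver.find_elements(By.XPATH, "//div[starts-with(@class,'deviceListItem')]/a")
mblUrls = []
for mblOption in mblOptions:
    # Read the link target from each anchor element
    mblUrls.append(mblOption.get_attribute('href'))

print(mblUrls)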

Output:

[..., 'https://shop.freedommobile.ca/devices/Huawei/P30_lite?sku=886598061131&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Huawei/Mate_20_Pro?sku=886598058964&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2010GB', 'https://shop.freedommobile.ca/devices/LG/X_Power_3?sku=652810831130&planSku=Freedom%20LTE%2B3G%209.5GB%20Promo', 'https://shop.freedommobile.ca/devices/LG/G8_ThinQ?sku=652810832434&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2010GB', 'https://shop.freedommobile.ca/devices/LG/Q_Stylo_+?sku=652810831222&planSku=Freedom%202GB', 'https://shop.freedommobile.ca/devices/Alcatel/GoFLIP?sku=889063504010&planSku=Freedom%20500MB', 'https://shop.freedommobile.ca/devices/Bring_Your/Own_Device?sku=byod']
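
Besides the locator error, the loop in the question iterates over range(len(...)), so it would store the indices 0, 1, 2, ... as strings rather than the links. The sketch below contrasts the two loops. FakeElement is a hypothetical stand-in for Selenium's WebElement (not part of Selenium) so the snippet runs without a browser; the two URLs are taken from the question's expected output.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        # Only "href" is modelled here; a real WebElement supports any attribute.
        return self._href if name == "href" else None


elements = [
    FakeElement("https://shop.freedommobile.ca/devices/Apple/iPhone_XS_Max?sku=190198786135&planSku=Freedom%20Big%20Gig%2015GB"),
    FakeElement("https://shop.freedommobile.ca/devices/Apple/iPhone_XR?sku=190198776631&planSku=Freedom%20Big%20Gig%2015GB"),
]

# What the question's loop does: collects the loop indices as strings.
indices_as_text = [str(i) for i in range(len(elements))]

# What the answer's loop does: reads the href from each element.
hrefs = [element.get_attribute("href") for element in elements]

print(indices_as_text)
print(hrefs)
```

With real WebElement objects returned by Selenium, the second comprehension is the one that yields the URLs.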

To extract all the URLs that https://shop.freedommobile.ca/devices has using Selenium, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use the following Locator Strategy:

  • Code block:

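    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    # options.add_argument('disable-infobars')
    # Assumes Selenium 4: the driver path goes through Service and the options are passed as options=
    # (Selenium 3 used the chrome_options= and executable_path= keywords instead)
    driver = webdriver.Chrome(service=Service(r'C:\WebDrivers\chromedriver.exe'), options=options)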
    driver.get("https://shop.freedommobile.ca/devices")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[text()='See Options']")))])
    
  • Console output:

    ['https://shop.freedommobile.ca/devices/Apple/iPhone_XS_Max?sku=190198786135&planSku=Freedom%20Big%20Gig%2015GB', 'https://shop.freedommobile.ca/devices/Apple/iPhone_XS?sku=190198790569&planSku=Freedom%20Big%20Gig%2015GB', 'https://shop.freedommobile.ca/devices/Apple/iPhone_XR?sku=190198776631&planSku=Freedom%20Big%20Gig%2015GB', 'https://shop.freedommobile.ca/devices/Apple/iPhone_8_Plus?sku=190198454249&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Apple/iPhone_8?sku=190198450944&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Samsung/Galaxy_S10+?sku=887276301570&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Samsung/Galaxy_S10?sku=887276312163&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2015GB', 'https://shop.freedommobile.ca/devices/Samsung/Galaxy_S10e?sku=887276313870&planSku=Freedom%20Big%20Gig%2015GB', 'https://shop.freedommobile.ca/devices/Samsung/Galaxy_Tab_A_8_LTE?sku=887276299440&planSku=Promo%20Tablet%2015', 'https://shop.freedommobile.ca/devices/Samsung/Galaxy_Note9?sku=887276279916&planSku=Freedom%20Big%20Gig%2015GB', 'https://shop.freedommobile.ca/devices/Samsung/Galaxy_S9?sku=887276250861&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Motorola/G7_Power?sku=723755134249&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Motorola/Moto_E5_Play?sku=723755125940&planSku=Freedom%20LTE%2B3G%209.5GB%20Promo', 'https://shop.freedommobile.ca/devices/Google/Pixel_3a?sku=842776111326&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Google/Pixel_3?sku=842776109798&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2010GB', 'https://shop.freedommobile.ca/devices/Google/Pixel_3_XL?sku=842776109828&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2010GB', 'https://shop.freedommobile.ca/devices/ZTE/Z557?sku=885913107448&planSku=Freedom%20500MB', 
'https://shop.freedommobile.ca/devices/LG/G7_ThinQ?sku=652810830737&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Huawei/P30_lite?sku=886598061131&planSku=Freedom%20Big%20Gig%20%2B%20Talk%205GB', 'https://shop.freedommobile.ca/devices/Huawei/Mate_20_Pro?sku=886598058964&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2010GB', 'https://shop.freedommobile.ca/devices/LG/X_Power_3?sku=652810831130&planSku=Freedom%20LTE%2B3G%209.5GB%20Promo', 'https://shop.freedommobile.ca/devices/LG/G8_ThinQ?sku=652810832434&planSku=Freedom%20Big%20Gig%20%2B%20Talk%2010GB', 'https://shop.freedommobile.ca/devices/LG/Q_Stylo_+?sku=652810831222&planSku=Freedom%202GB', 'https://shop.freedommobile.ca/devices/Alcatel/GoFLIP?sku=889063504010&planSku=Freedom%20500MB', 'https://shop.freedommobile.ca/devices/Bring_Your/Own_Device?sku=byod']
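
To print such a list as numbered lines, as the question's last loop attempts with a manual counter, enumerate can do the counting. This is only a sketch: format_numbered is a hypothetical helper name, and the two sample URLs are copied from the output above.

```python
def format_numbered(urls):
    """Return one 'phone N : url' line per URL, numbered from 1."""
    return [f"phone {number} : {url}" for number, url in enumerate(urls, start=1)]


sample_urls = [
    "https://shop.freedommobile.ca/devices/Alcatel/GoFLIP?sku=889063504010&planSku=Freedom%20500MB",
    "https://shop.freedommobile.ca/devices/Bring_Your/Own_Device?sku=byod",
]

for line in format_numbered(sample_urls):
    print(line)
```

Passing start=1 to enumerate removes the need for a separate numLink variable.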
    

Try using a list comprehension to achieve this. Take a look at this part of your code, (By.XPATH(XPathLocation))): it should be wait.until(EC.visibility_of_all_elements_located((By.XPATH, "some_xpath"))).

Something more like this:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

with webdriver.Chrome() as driver:
    wait = WebDriverWait(driver, 10)
    driver.get("https://shop.freedommobile.ca/devices")
    item_links = [item.get_attribute("href") for item in wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//a[contains(@class,'__DeviceDetailsButton')]")))]
    print(item_links)
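
The TypeError in the question follows from the fact that By.XPATH is a plain string constant ("xpath"), so writing By.XPATH(...) tries to call a str. The sketch below reproduces that without a browser. The By class here is a small stand-in that mirrors the constant from selenium.webdriver.common.by, and the XPath is the one used in an earlier answer.

```python
class By:
    """Stand-in mirroring selenium.webdriver.common.by.By (illustration only)."""

    XPATH = "xpath"


xpath = "//a[text()='See Options']"

# Wrong: By.XPATH is a str, so calling it raises TypeError.
try:
    By.XPATH(xpath)
except TypeError as error:
    message = str(error)
    print(message)

# Right: a locator is a (strategy, value) tuple, passed as ONE argument
# to the expected condition, hence the double parentheses in the answers.
locator = (By.XPATH, xpath)
print(locator)
```

The same tuple form applies to both presence_of_all_elements_located and visibility_of_all_elements_located.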
