Hello, I am trying to extract the URLs of all the basketball matches listed in the table on this page: https://www.oddsportal.com/matches/basketball/20200907/
Here is my Python script:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Configure headless Chrome.
options = Options()
# Equivalent to options.headless = True, which is deprecated in recent Selenium 4 releases.
options.add_argument("--headless")
options.add_argument("window-size=1400,800")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("start-maximized")
options.add_argument("enable-automation")
options.add_argument("--disable-infobars")
options.add_argument("--disable-dev-shm-usage")

driver = webdriver.Chrome(options=options)
driver.get("https://www.oddsportal.com/matches/basketball/20200907/")

# Wait up to 10 seconds, then collect every anchor whose href contains "/basketball/".
anchors = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.XPATH, "//a[contains(@href, '/basketball/')]"))
)
url_links = [anchor.get_attribute("href") for anchor in anchors]
print(len(url_links), '\n')
print(url_links, '\n')

# quit() closes every window and ends the session, so a separate close() is not needed.
driver.quit()
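The output contains the URLs from the match table, but also URLs from other tables and links on the page. In my case I only want the 9 URLs that point to the 9 basketball matches. How can I filter the list down to those URLs?
Thanks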
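
One possible way to narrow the list is to filter the collected hrefs by the shape of their path. The sketch below is not a confirmed solution for this site. It assumes that a match page has exactly four path segments (sport, country, league, match slug), while league pages and the `/matches/...` navigation links have fewer segments or a different first segment. That assumption should be checked against the printed `url_links` before relying on it. The helper names `is_match_url` and `filter_match_urls` are hypothetical.

```python
from urllib.parse import urlparse


def is_match_url(url, sport="basketball"):
    """Return True if the URL path looks like /<sport>/<country>/<league>/<match>/.

    Assumption (not confirmed by the page itself): match pages have exactly
    four path segments and the first segment is the sport name.
    """
    segments = [part for part in urlparse(url).path.split("/") if part]
    return len(segments) == 4 and segments[0] == sport


def filter_match_urls(urls, sport="basketball"):
    """Keep only match-like URLs, dropping None values and duplicates, in order."""
    seen = set()
    matches = []
    for url in urls:
        # get_attribute("href") can return None, so skip empty values first.
        if url and is_match_url(url, sport) and url not in seen:
            seen.add(url)
            matches.append(url)
    return matches
```

With the script above, this would be called as `match_links = filter_match_urls(url_links)` before printing. An alternative is to tighten the XPath so it only matches anchors inside the match table, which requires inspecting the page markup to find a stable container for that table.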