How to extract titles/text from Google Trends and print them with Selenium and Python

Posted 2024-06-07 10:00:50


I want to extract the title from each row of this site:

https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all

I have tried several times without luck. I thought that searching for the elements by class name would give me the text I want:

from selenium import webdriver

# 'path to bin' is a placeholder for the chromedriver location
driver = webdriver.Chrome('path to bin')
driver.get('https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all')
hrefs = driver.find_elements_by_class_name('title')
print(hrefs)  # prints the WebElement objects, not their text
print(len(hrefs))
driver.quit()

Thanks in advance, everyone! 琼


2 answers

You were so close! You only need to read the text of each title element. Try the following:

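from selenium import webdriver
from selenium.webdriver.common.by import By

# 'path to bin' is a placeholder; newer Selenium releases take the driver path through a Service object
driver = webdriver.Chrome('path to bin')
driver.get('https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all')
# find_elements(By.CLASS_NAME, ...) replaces find_elements_by_class_name, which newer Selenium 4 releases removed
titles = driver.find_elements(By.CLASS_NAME, 'title')
for title in titles:
    print(title.text)  # .text holds the element's visible text
driver.quit()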
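
The difference between printing the list and printing each element's text can be seen without opening a browser. The sketch below is only an illustration: `FakeElement` and `collect_titles` are hypothetical names, not part of Selenium, and the stand-in objects merely mimic the `.text` attribute of a Selenium `WebElement`. The sample strings are made up.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (it only has .text)."""

    def __init__(self, text):
        self.text = text


def collect_titles(elements):
    """Return the stripped, non-empty text of each element, in order."""
    titles = []
    for element in elements:
        text = element.text.strip()
        if text:  # skip elements whose text is empty
            titles.append(text)
    return titles


if __name__ == "__main__":
    # Made-up sample data standing in for the result of a find_elements call.
    found = [FakeElement("First headline "), FakeElement(""), FakeElement("Second headline")]
    print(found)                  # a list of objects, not their text
    print(collect_titles(found))  # the text of each non-empty element
```

With a real driver, the same helper could presumably be called on the list returned by the element lookup, since it only relies on each item having a `.text` attribute.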

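@PixelEinstein's answer will meet your needs perfectly. As a best practice, though, you should always maximize the browser window and use WebDriverWait to wait until the elements are visible before extracting their text, as follows: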

  • Code block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")  # open the browser window maximized
    options.add_argument('disable-infobars')
    # options= replaces the deprecated chrome_options= keyword
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get('https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all')
    # wait up to 20 seconds for all matching title elements to become visible
    titles = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='title']")))
    for title in titles:
        print(title.text)
    driver.quit()
    
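
The waiting idea in this answer can be sketched in plain Python. This is not Selenium's implementation: `wait_until` is a hypothetical helper that only illustrates the general pattern of polling a condition until it returns a truthy value or a timeout expires, which is the kind of behaviour `WebDriverWait(...).until(...)` provides for a driver. The condition and titles in the demo are made up.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Made-up condition: returns an empty list twice, then some titles,
    # imitating a page whose rows appear only after a short delay.
    attempts = {"count": 0}

    def titles_ready():
        attempts["count"] += 1
        return ["Title A", "Title B"] if attempts["count"] >= 3 else []

    print(wait_until(titles_ready, timeout=5.0, poll_interval=0.1))
```

Reading the elements immediately after `driver.get(...)` corresponds to calling the condition once; if the page fills in its rows afterwards, that single call may come back empty, which is why polling with a timeout tends to be more reliable.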
