Not getting all of a website's elements using Selenium


Here is the code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
from selenium.webdriver.support import expected_conditions as EC
import time
import sys


login_url = 'https://www.researchgate.net/login'
base_url = "https://www.researchgate.net/institution/Islamia_College_Peshawar/department/Department_of_Computer_Science/members"
chrome_driver_path = '/home/danish-khan/scrapers/researchgate/chromedriver'

chrome_options = Options()
#chrome_options.add_argument('--headless')

driver = webdriver.Chrome(
    executable_path=chrome_driver_path, options=chrome_options
)

# login credentials (replace with your own)
username = 'your username'
password = 'your password'

with driver:
    # Set timeout time 
    wait = WebDriverWait(driver, 2)

    # retrieve the login page
    driver.get(login_url)
    
    driver.find_element_by_id("input-login").send_keys(username)
    driver.find_element_by_id("input-password").send_keys(password)
    driver.find_element_by_class_name("nova-c-button__label").find_element(By.XPATH, "./..").click()
    time.sleep(2)

    driver.get(base_url)

    time.sleep(10)
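    # scroll to the bottom once to trigger loading of more member profiles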
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(20)
    names = driver.find_elements_by_css_selector('.display-name')
    print('total names:',len(names))
  
time.sleep(10)

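# note: the with block above has already quit the driver, so this close() call targets a dead session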
driver.close()

Here is the output:

total names: 20
Traceback (most recent call last):
  File "/home/danish-khan/scrapers/scrpers/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/home/danish-khan/scrapers/scrpers/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/home/danish-khan/scrapers/scrpers/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

I'm trying to figure out why it doesn't return all the elements: the page lists more than 30 names/profiles, but the script only finds 20 of them. I applied a wait-until-the-element-is-present strategy, but it didn't work.
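
For reference, this is the kind of loop I think might be needed — keep scrolling and re-counting until the number of '.display-name' elements stops growing — though I'm not sure it is the right approach (the helper name, pause length and round limit below are just placeholders):

from selenium.webdriver.common.by import By
import time

def collect_all_names(driver, pause=3, max_rounds=15):
    # keep scrolling until the number of '.display-name' elements stops growing
    names = []
    last_count = -1
    while len(names) != last_count and max_rounds > 0:
        last_count = len(names)
        # scroll to the bottom to trigger loading of the next batch of members
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # crude wait; give the new batch time to render
        names = driver.find_elements(By.CSS_SELECTOR, '.display-name')
        max_rounds -= 1
    return [n.text for n in names]

# usage, after logging in and opening base_url:
# print('total names:', len(collect_all_names(driver)))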

Is there any solution?

