Exception when converting BeautifulSoup code to Selenium

Posted 2024-05-14 04:30:09


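I have the code below to scrape a website, and it works without problems. I then wanted to use only Selenium, so I changed the code as shown, and now I get errors that I don't understand. Can anyone help?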

Error with webdriver.PhantomJS():

Exception: Message: {"errorMessage":"Element does not exist in cache"

Error with webdriver.Chrome():

[error output missing from the source]

Selenium-only code:

driver = webdriver.Chrome()  # or webdriver.PhantomJS()
a = driver.find_elements_by_css_selector(findTag + "." + findValue + " a")
img = driver.find_elements_by_css_selector(findTag + "#" + findValue + "img")
href = a.get_attribute('href')
src = img.get_attribute("src")

Selenium + BeautifulSoup code:

driver = webdriver.Chrome() # or webdriver.PhantomJS()
soup = bs4.BeautifulSoup(driver.page_source, "html.parser")

a = soup.find(findTag, class_=findValue).find_all("a")
img = soup.find(findTag, id=findValue).find_all("img")
href = a.get("href")
src = img.get("src")

1 answer
User
#1 · Posted 2024-05-14 04:30:09

Have you tried waiting? Something like this:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # or webdriver.PhantomJS()

# First wait (up to 30 seconds) until your image is visible on the page.
WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.ID, "YourImgId")))

# Now look it up in the DOM. find_element returns a single element;
# find_elements returns a list, which has no get_attribute().
# Note the space before "img": it selects an <img> inside the matched tag.
img = driver.find_element(By.CSS_SELECTOR, findTag + "#" + findValue + " img")
a = driver.find_element(By.CSS_SELECTOR, findTag + "." + findValue + " a")

href = a.get_attribute("href")
src = img.get_attribute("src")

Hope this helps :)

About waits: http://selenium-python.readthedocs.io/waits.html
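One more thing worth checking in the Selenium-only code from the question: the plural find_elements_* calls return a list, and a list has no get_attribute method. If you do want every matching link and image rather than just the first one, a minimal sketch could look like this. The names collect_attribute and scrape are made up for illustration, and it assumes a Selenium version where driver.find_elements(By.CSS_SELECTOR, ...) is available.

```python
def collect_attribute(elements, name):
    """Return the non-empty value of attribute `name` for each element."""
    values = []
    for element in elements:
        value = element.get_attribute(name)
        if value:  # skip elements where the attribute is missing or empty
            values.append(value)
    return values


def scrape(driver, find_tag, find_value):
    """Hypothetical helper: collect link hrefs and image srcs from the page."""
    # Imported here so collect_attribute can be reused without Selenium installed.
    from selenium.webdriver.common.by import By

    # Same selectors as in the question: links under tag.class, images under tag#id.
    links = driver.find_elements(By.CSS_SELECTOR, f"{find_tag}.{find_value} a")
    images = driver.find_elements(By.CSS_SELECTOR, f"{find_tag}#{find_value} img")
    return collect_attribute(links, "href"), collect_attribute(images, "src")
```

You would call it as hrefs, srcs = scrape(driver, findTag, findValue) once the page has loaded.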

Edit: it is not a waiting problem.

Just navigate to the page with Selenium, enter your credentials, and then use BeautifulSoup to scrape the page. That works fine :)

[code block missing from the source]

Output:

>>> [u'http://ipcamera-viewer.com/image/?p=199619_20170301_201334_5668.jpg', u'http://ipcamera-viewer.com/image/?p=199619_20170301_201329_5611.jpg']
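The idea from the edit above (let Selenium do the navigation and login, then hand driver.page_source to BeautifulSoup) can be sketched roughly as follows. This is not the original code from this answer, just an illustration; the function name extract_urls is made up, and it assumes the same class and id lookups as the question's Selenium + BeautifulSoup code.

```python
def extract_urls(page_source, find_tag, find_value):
    """Parse rendered HTML and return (hrefs, srcs) as two lists of strings."""
    # Third-party dependency (pip install beautifulsoup4), imported on first use.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(page_source, "html.parser")

    # As in the question: links live under tag.class, images under tag#id.
    link_box = soup.find(find_tag, class_=find_value)
    image_box = soup.find(find_tag, id=find_value)

    # find_all returns a list of tags, so read the attribute from each one.
    hrefs = [a["href"] for a in link_box.find_all("a", href=True)] if link_box else []
    srcs = [img["src"] for img in image_box.find_all("img", src=True)] if image_box else []
    return hrefs, srcs
```

After the Selenium login steps you would call hrefs, srcs = extract_urls(driver.page_source, findTag, findValue). Reading the attribute per tag avoids calling .get() on the list that find_all returns.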
