How to bypass errors in Selenium Python web scraping

Posted 2024-05-04 11:02:40


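I wrote the web scraping code below. Without the if-else part, the code works as intended. I have a list of URLs to scrape, and if the element is missing on any URL I need to skip that URL and move on to the next one. I managed to skip the URLs that have no data, but the normal scraping in the else branch does not work. Can anyone help?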

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException  

urls= [
'http://www.marketsmojo.com/Stocks?StockId=1002687&Exchange=0'
]
f= open("lolly.txt","a+")
browser=webdriver.Chrome()
browser.maximize_window()
browser.get('http://www.marketsmojo.com/Stocks?StockId=565016&Exchange=0')
browser.find_element_by_xpath("//*[@id='step-0']/a/i").click()
for url in urls:
    browser.get(url)
    browser.execute_script("window.scrollTo(10,9500);")
    browser.implicitly_wait(2000)
    if browser.find_element_by_xpath("//div[contains(.,' No Shareholding data available ')]"):
        continue
    else:
        add=browser.find_element_by_css_selector('#btnShareholdingDashboardFullDetails')
        SearchButton = browser.find_element_by_css_selector('#btnShareholdingDashboardFullDetails')
        Hover = ActionChains(browser).move_to_element(add).move_to_element(SearchButton)
        Hover.click().perform()
        browser.find_elements_by_css_selector('#allquarters > div > table')
        add1 = browser.find_element_by_css_selector('#AllQuarters')
        SearchButton1 = browser.find_element_by_css_selector('#AllQuarters')
        Hover1 = ActionChains(browser).move_to_element(add).move_to_element(SearchButton1)
        Hover1.click().perform()
        data = []
        for tr in browser.find_elements_by_css_selector('#allquarters > div > table'):
            ths = tr.find_elements_by_tag_name('th')
            tds = tr.find_elements_by_tag_name('td')
            if ths: 
                data.append([th.text for th in ths])
            if tds: 
                data.append([td.text for td in tds])
            f.write(str(data))


browser.quit()
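
One likely cause, offered as a sketch rather than a confirmed diagnosis: `find_element_by_xpath` raises `NoSuchElementException` when nothing matches instead of returning a falsy value, so the `else` branch above can never be reached through the `if` test. The plural `find_elements_by_xpath` returns an empty list in that case, which makes the check safe. The helper names `has_no_data_banner` and `scrape_urls`, and the `scrape_page` callback, are hypothetical and introduced only for illustration. The code keeps the Selenium 3 style calls used in the question; in Selenium 4 the equivalent call is `browser.find_elements(By.XPATH, ...)`.

```python
# XPath copied from the question; it matches the "no data" message.
NO_DATA_XPATH = "//div[contains(.,' No Shareholding data available ')]"


def has_no_data_banner(browser):
    """Return True if the 'no data' message is present on the current page.

    find_elements_by_xpath returns an empty list when nothing matches,
    so this check never raises NoSuchElementException.
    """
    return len(browser.find_elements_by_xpath(NO_DATA_XPATH)) > 0


def scrape_urls(browser, urls, scrape_page):
    """Visit each URL, skip pages showing the banner, scrape the rest.

    scrape_page is a hypothetical callback that would hold the
    click-and-read-table logic from the question's else branch.
    Returns a dict mapping each scraped URL to the callback's result.
    """
    results = {}
    for url in urls:
        browser.get(url)
        if has_no_data_banner(browser):
            continue  # nothing to scrape here, go to the next URL
        results[url] = scrape_page(browser)
    return results
```

To use this with the code above, the body of the `else` branch would move into a function passed as `scrape_page`, and `browser` would be the `webdriver.Chrome()` instance. Note also that `implicitly_wait` takes seconds, so `implicitly_wait(2000)` lets every lookup that finds nothing block for up to 2000 seconds, including the `find_elements_by_xpath` check on pages that do have data. A much smaller value, set once before the loop, is probably what was intended.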
