Python Selenium: scraping a web page

Posted 2024-05-15 09:49:17


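I want to use Selenium to get the data from the "Stock Style-Weight" box at https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3

The data is inside an iframe. I can switch to the iframe and click the "Weight" radio button, but I cannot retrieve the nine numbers.

Here is my code: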

driver = webdriver.Chrome(chromedriver)
driver.get("https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3")

iframe = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//iframe[@id='portfolio']")))
driver.switch_to.frame(iframe)

element1=driver.find_element_by_xpath('/html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[2]/div[2]')
element2=element1.find_element_by_css_selector("input[type='radio'][value='Weight']").click()

I tried several selectors:

driver.find_element_by_xpath('*//div/div[2]/div/div[2]/div/svg/g/g[3]/g[2]/g[1]/text')
driver.find_element_by_css_selector("mbc-chart-group> g.style-box-text-layer > g:nth-child(1)")

But I get the same error every time:

NoSuchElementException: no such element: Unable to locate element

2 answers

You need to add these two lines to click the "Accept Cookies" button and the investor type button:

WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='onetrust-accept-btn-handler']"))).click()
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='btn_professional']"))).click()

Full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3")

WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='onetrust-accept-btn-handler']"))).click()
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='btn_professional']"))).click()

iframe = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//iframe[@id='portfolio']")))
driver.switch_to.frame(iframe)

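# find_element(By...) replaces the find_element_by_* helpers, which were removed in Selenium 4
element1 = driver.find_element(By.XPATH, '/html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[2]/div[2]')
# click() returns None, so its result is not assigned
element1.find_element(By.CSS_SELECTOR, "input[type='radio'][value='Weight']").click()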

The elements are inside svg text tags. To access them, you need to use:

//*[local-name()='svg'] or //*[name()='svg']

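To illustrate why a plain tag name does not match SVG nodes, here is a minimal sketch using only the standard library rather than Selenium. The markup is a hypothetical, simplified stand-in for the real page, and `xml.etree.ElementTree` expresses the namespace with `{uri}tag` syntax instead of `local-name()`, but the underlying cause is the same: the svg elements live in the SVG namespace.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical, simplified markup; the real page is far more deeply nested.
DOC = """
<div class="sal-stock-style__weight">
  <svg xmlns="http://www.w3.org/2000/svg" role="chart">
    <g class="style-box-text-layer">
      <text>15</text>
      <text>6</text>
      <text>4</text>
    </g>
  </svg>
</div>
"""

root = ET.fromstring(DOC)

# A plain tag name only matches elements that are in no namespace.
plain = root.findall(".//text")

# Qualifying the tag with the SVG namespace matches the text nodes.
namespaced = root.findall(f".//{{{SVG_NS}}}text")
values = [node.text for node in namespaced]

print(len(plain), values)
```

In browser XPath, `local-name()` or `name()` plays the role of the namespace-qualified tag used here.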

The XPath for the numbers is:

//div[@class='sal-stock-style__weight']//*[name()='svg' and @role='chart']//*[name()='g' and @class='style-box-text-layer']//*[name()='text']

Try the following and confirm:

numbers = driver.find_elements(By.XPATH, "//div[@class='sal-stock-style__weight']//*[name()='svg' and @role='chart']//*[name()='g' and @class='style-box-text-layer']//*[name()='text']")
for num in numbers:
    print(num.text)
15
6
4
22
14
2
19
13
2
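If the nine values are needed as numbers rather than printed strings, one possible follow-up is to convert them and group them into rows. This is a sketch under assumptions: `to_grid` is a hypothetical helper, not part of Selenium, and it assumes the text nodes come back row by row, which the answer does not confirm.

```python
from typing import List, Sequence


def to_grid(texts: Sequence[str], width: int = 3) -> List[List[int]]:
    """Convert scraped strings to ints and split them into rows of `width`."""
    # Skip empty strings, which an element with no rendered text would produce.
    values = [int(t.strip()) for t in texts if t.strip()]
    if len(values) % width != 0:
        raise ValueError(f"expected a multiple of {width} values, got {len(values)}")
    return [values[i:i + width] for i in range(0, len(values), width)]


# The strings printed above; with Selenium this would be [n.text for n in numbers].
sample = ["15", "6", "4", "22", "14", "2", "19", "13", "2"]
grid = to_grid(sample)
for row in grid:
    print(row)
```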
