Extracting a link from an XPath with Selenium WebDriver and Python?

8 votes

1 answer

16649 views

Asked 2025-04-17 18:56

I'm fairly new to Selenium WebDriver and Python, so my question may be a bit basic.

I have the following HTML:

<a class="wp-first-item" href="admin.php?page=account">Account</a>

I want to extract the link address (the href) from it, and I know its XPath is ".//*[@id='toplevel_page_menu']/ul/li[2]/a".

How do I do that?

driver.find_element_by_xpath(".//*[@id='toplevel_page_menu']/ul/li[2]/a").link

or

driver.find_element_by_xpath(".//*[@id='toplevel_page_menu']/ul/li[2]/a").href

Neither seems to work. The result is:

AttributeError: 'WebElement' object has no attribute 'link'

The result I'm hoping for is "admin.php?page=account".

xpath web scraping HTML webdriver selenium link extraction

1 Answer

You can use get_attribute:

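from selenium.webdriver.common.by import By

# find_element_by_xpath was removed in Selenium 4.3; use find_element(By.XPATH, ...) instead
element = driver.find_element(By.XPATH, ".//*[@id='toplevel_page_menu']/ul/li[2]/a")
href = element.get_attribute('href')
print(href)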
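
One caveat: as far as I know, get_attribute('href') usually returns the fully resolved URL (something like http://.../admin.php?page=account) rather than the literal text in the markup. Selenium 4 also has element.get_dom_attribute('href'), which returns the value as written in the HTML. If you only have the absolute form, a minimal sketch like the one below trims it back down with the standard library. The helper name relative_href and the example.com URL are made up for illustration, and it assumes you only want the last path segment plus the query string.

```python
from urllib.parse import urlsplit


def relative_href(absolute_url):
    """Return the last path segment plus the query string of a URL.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    parts = urlsplit(absolute_url)
    # Keep only the file name at the end of the path, e.g. "admin.php"
    name = parts.path.rsplit("/", 1)[-1]
    # Re-attach the query string if there is one
    return name + ("?" + parts.query if parts.query else "")


# Assumed example of what get_attribute('href') might return
print(relative_href("http://example.com/wp-admin/admin.php?page=account"))
```

With the assumed URL above, this prints admin.php?page=account.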

I usually use Selenium to open the page, grab the page source, and then parse it with BeautifulSoup:

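from bs4 import BeautifulSoup  # BeautifulSoup 4; "from BeautifulSoup import ..." is the old version 3

# On the current page
source = driver.page_source
soup = BeautifulSoup(source, 'html.parser')

# XPath positions are 1-based, so li[2] corresponds to index 1 here
href = soup('<the tag containing the anchor>', {'id': 'toplevel_page_menu'})[0]('ul')[0]('li')[1]('a')[0]['href']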

BeautifulSoup doesn't support XPath, though, so the line above is a BeautifulSoup rendering of your XPath (as I understand it).
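
If you want to keep the XPath as-is outside of Selenium, one option is the standard library's xml.etree.ElementTree, which supports a limited XPath subset that happens to cover this expression. This is only a sketch under a big assumption: the markup must be well-formed XML, which real page_source output often is not. The SAMPLE snippet below is invented to mirror the structure your XPath implies; it is not your actual page.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in for the relevant part of the page
SAMPLE = """
<div>
  <li id="toplevel_page_menu">
    <ul>
      <li><a href="admin.php?page=menu">Menu</a></li>
      <li><a class="wp-first-item" href="admin.php?page=account">Account</a></li>
    </ul>
  </li>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# Same XPath as in the question; li[2] selects the second <li> child of <ul>
link = root.find(".//*[@id='toplevel_page_menu']/ul/li[2]/a")

# Element.get returns the attribute exactly as written in the markup
href = link.get("href")
print(href)
```

On this sample it prints admin.php?page=account. For messy real-world HTML, a tolerant parser is the safer choice.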

Answered 2025-04-17 by Python大师
