Cannot find a specific link using Selenium with Python

Posted 2024-04-29 03:21:12


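I am trying to extract a specific link from a website that generates its content with JavaScript. When I inspect the site manually, I can easily find the link I want (see the example below). Basically, I want a way to automatically find the

<a href="/bsbe/document/JURE210005412/format/xsl/part/L?oi=5wDyMgzh8g&amp;sourceP=%7B%22source%22%3A%22TL%22%2C%22sort%22%3A%22date%22%7D"...>

tag that follows <li class="toolpane_list_entry toolpane_list_entry_right">.

For example: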

<li class="toolpane_list_entry toolpane_list_entry_right">
   <a href="/bsbe/document/JURE210005412/format/xsl/part/L?oi=5wDyMgzh8g&amp;sourceP=%7B%22source%22%3A%22TL%22%2C%22sort%22%3A%22date%22%7D" class="button bnext__button button--next bnext__button--next" id="docnextbuttontop" aria-disabled="false" aria-controls="id_docPanelContainer">
      <span>
         <span>Nächster Treffer</span>
         <em class="sicon" aria-hidden="true">
            <svg focusable="false" class="svg-icon-chevron_right" height="100%" viewBox="0 0 24 24" width="100%" xmlns="http://www.w3.org/2000/svg">
               <path fill="currentColor" d="M10 6L8.59 7.41 13.17 12l-4.58 4.59L10 18l6-6z"></path>
               <path d="M0 0h24v24H0z" fill="none"></path>
            </svg>
         </em>
      </span>
   </a>

However, when I load this page with Selenium and Python, the <a href...> I am looking for is not there (see below):

<li class="toolpane_list_entry toolpane_list_entry_right">
   <span class="button bnext__button button--nextDisabled bnext__button--nextDisabled">
      <span>Nächster Treffer</span>
      <em aria-hidden="true" class="sicon">
         <svg class="svg-icon-chevron_right" focusable="false" height="100%" viewbox="0 0 24 24" width="100%" xmlns="http://www.w3.org/2000/svg">
            <path d="M10 6L8.59 7.41 13.17 12l-4.58 4.59L10 18l6-6z" fill="currentColor"></path>
            <path d="M0 0h24v24H0z" fill="none"></path>
         </svg>
      </em>
   </span>
</li>

As you can see, the entire <a href> tag is missing.
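To make the difference between the two snippets checkable in code, here is a minimal sketch that inspects a page source string and reports whether the enabled `<a>` link or the disabled `<span>` placeholder is present. It uses only the standard library; the names `NextButtonFinder` and `next_button_state` are made up for illustration, and the class names are taken from the snippets above.

```python
from html.parser import HTMLParser


class NextButtonFinder(HTMLParser):
    """Record the href of the enabled "next" link and whether the
    disabled placeholder is present (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.href = None       # href of <a class="... button--next ...">, if any
        self.disabled = False  # True if <span class="... button--nextDisabled ..."> is seen

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "a" and "button--next" in classes:
            self.href = attr.get("href")
        elif tag == "span" and "button--nextDisabled" in classes:
            self.disabled = True


def next_button_state(html):
    """Return (href_or_None, disabled_flag) for the given HTML string."""
    finder = NextButtonFinder()
    finder.feed(html)
    return finder.href, finder.disabled
```

Calling `next_button_state(driver.page_source)` after `driver.get(...)` would show which of the two variants Selenium actually received.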

Here is my Python code:

from selenium.webdriver.firefox.options import Options as FirefoxOptions
from bs4 import BeautifulSoup
import os
from selenium import webdriver

options = FirefoxOptions()
options.add_argument("--headless")
firefox_driver = os.getcwd() +"\\geckodriver.exe"
driver = webdriver.Firefox(options=options, executable_path=firefox_driver)
driver.get("https://gesetze.berlin.de/bsbe/document/JURE210005730") 

# returns empty list
driver.find_elements_by_class_name("button bnext__button button--next bnext__button--next")


soup_file=driver.page_source
soup = BeautifulSoup(soup_file)
print(soup.find_all("li", {"class":"toolpane_list_entry toolpane_list_entry_right"}))

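Do you know what the problem is? Do you think there is a way to extract the link? Is there any other information I can provide to help identify the problem?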

Thanks
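Two things may be worth checking; neither is confirmed by the post. First, Selenium's class-name locator accepts a single class name, so a space-separated string such as "button bnext__button button--next bnext__button--next" will not match; a CSS selector that chains the classes is the usual alternative. Second, the page source shown above contains a `<span>` with `button--nextDisabled`, which suggests the site itself rendered the button as disabled. One guess is that the "Nächster Treffer" link only exists when the document is opened from a search result list, but that is an assumption. The sketch below assumes Selenium 4, where `find_elements(By.CSS_SELECTOR, ...)` replaces the removed `find_elements_by_*` helpers; `compound_class_selector` and `next_hit_url` are hypothetical names.

```python
from urllib.parse import urljoin


def compound_class_selector(class_attr, tag=""):
    """Turn a space-separated class attribute into a CSS selector,
    e.g. "a b" with tag "li" becomes "li.a.b"."""
    return tag + "".join("." + name for name in class_attr.split())


def next_hit_url(driver, base_url):
    """Return the absolute URL of the "next hit" link, or None if the
    page only contains the disabled placeholder (hypothetical helper)."""
    # Imported lazily so the selector helper above works without Selenium.
    from selenium.webdriver.common.by import By

    selector = compound_class_selector(
        "button bnext__button button--next bnext__button--next", tag="a"
    )
    links = driver.find_elements(By.CSS_SELECTOR, selector)
    if not links:
        return None
    return urljoin(base_url, links[0].get_attribute("href"))
```

With the driver from the question, `next_hit_url(driver, "https://gesetze.berlin.de")` would return `None` for as long as the page renders the disabled `<span>`. In that case the selector is not the only problem, and how the page is reached would need a closer look.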

