Selenium WebDriver (Chrome): Elements panel vs. driver.page_source

Posted 2024-04-26 04:02:48


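I want to scrape an article from Medium.

It fails because driver.page_source (Selenium WebDriver) does not contain the target div.

[Example] Demystifying Python Decorators in 10 Minutes: https://medium.com/@adrianmarkperea/demystifying-python-decorators-in-10-minutes-ffe092723c6c

On this site, the content holder div has the class "x y z ab ac ez af ag", but that element does not appear in driver.page_source.

Short code is below.

This is not a timeout problem. It looks as if driver.page_source has not been processed by JavaScript, but I am not sure.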

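from bs4 import BeautifulSoup
from selenium import webdriver

ARTICLE = "https://medium.com/@adrianmarkperea/demystifying-python-decorators-in-10-minutes-ffe092723c6c"
driver = webdriver.Chrome()  # assumes a working Chrome/chromedriver setup
driver.get(ARTICLE)
# Parse the HTML that Selenium reports for the current page
text_soup = BeautifulSoup(driver.page_source, "html5lib")

# Look up the content holder div by its class names
text = text_soup.select(".x.y.z.ab.ac.ez.af.ag")
print(text)  # => []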

I expected driver.page_source to be the same as what the Elements panel in Chrome's developer tools shows.
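One way to check that expectation, sketched here under assumptions: wait explicitly for the target element, then return both driver.page_source and the live DOM serialized by JavaScript so the two can be compared. The helper names `classes_to_selector` and `rendered_html` are hypothetical and not part of Selenium. The sketch assumes Selenium 4 with a working Chrome/chromedriver setup.

```python
def classes_to_selector(class_attr):
    """Turn a class attribute such as "x y z" into the CSS selector ".x.y.z"."""
    return "".join("." + name for name in class_attr.split())


def rendered_html(url, selector, timeout=10):
    """Hypothetical helper: load url in Chrome, wait for selector, return two HTML snapshots."""
    # Imported here so classes_to_selector stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until the element exists in the live DOM;
        # a TimeoutException is raised if it never appears.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
        # Serialize the DOM as the browser currently holds it.
        live_dom = driver.execute_script("return document.documentElement.outerHTML")
        return driver.page_source, live_dom
    finally:
        driver.quit()
```

It could be called as `rendered_html(ARTICLE, classes_to_selector("x y z ab ac ez af ag"))`. If the wait times out even though the article is visible in the browser window, one possible explanation (an assumption, not verified here) is that the short class names are generated and differ between browser sessions. In that case a structural selector such as `article` may be worth trying instead.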

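Update: I ran an experiment. I suspected that webdriver does not return JavaScript-processed HTML source, so I loaded the HTML file below with Selenium.

However, I got back the HTML with the element removed.

Results:

webdriver and the regular Chrome console give the same result -> processed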

<html lang="en">
<body>

    <script type="text/javascript">
    document.querySelector("#id").remove();
    </script>

</body></html>

wget/requests -> not processed

<html lang="en">
<body>
    <div id="id">
        test element
    </div>    
    <script type="text/javascript">
    document.querySelector("#id").remove();
    </script>

</body></html>
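The "not processed" half of the experiment can be reproduced with the standard library alone. The sketch below serves the test page from a temporary local server and fetches it with urllib, which, like wget or requests, never executes JavaScript. `PageHandler` and `fetch_raw_html` are names made up for this illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# The test page from the experiment above, as it is sent over HTTP.
PAGE = b"""<html lang="en">
<body>
    <div id="id">
        test element
    </div>
    <script type="text/javascript">
    document.querySelector("#id").remove();
    </script>
</body></html>"""


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_raw_html():
    """Serve PAGE on a free local port and fetch it without running any JavaScript."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        # An empty ProxyHandler keeps the request on the loopback interface
        # even if proxy environment variables are set.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


raw_html = fetch_raw_html()
# The script tag arrives as text only, so the div it would remove is still there.
print('<div id="id">' in raw_html)  # True
```

A browser driven by Selenium runs the script before the page is read back, which matches the "processed" result reported above.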


0 answers

No answers yet.
