Scraping HTML content that only appears after clicking a button

Published 2024-04-28 05:01:52

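I am trying to scrape information from the following page: https://www.blockchain.com/btc/tx/800ce197af8a1a277ec314daba9c0b59c3ceee0f5beec415f5b8d54a3a9db96c

Specifically, I want every item with the class "sc-19pxzmk-0 lhmncg", which is essentially every address in a given Bitcoin transaction. But as you can see on the right side of the page, there is this element: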

<a class="sc-1r996ns-0 AqGqw sc-1tbyx6t-1 kXxRxe iklhnl-0 boNhIO" opacity="1">Load more outputs... (1 remaining)</a>

Clicking it reveals one more address. How can I open it programmatically? This is what I have tried so far:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver


output_class = 'sc-19pxzmk-0 lhmncg'
driver = webdriver.Chrome()
driver.get('https://www.blockchain.com/btc/tx/800ce197af8a1a277ec314daba9c0b59c3ceee0f5beec415f5b8d54a3a9db96c')
result = driver.execute_script("return document.documentElement.outerHTML")

soup = BeautifulSoup(result, 'lxml')
element = driver.find_elements_by_class_name(output_class)
inputs = soup.find_all('div', {'class': output_class})

Neither BeautifulSoup nor the driver returns the additional address.


2 Answers

If you are using Selenium, you don't need BeautifulSoup to get the data. Click the element directly with element.click() and read the results straight from the driver:

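from selenium import webdriver
from selenium.webdriver.common.by import By

output_class = 'sc-19pxzmk-0 lhmncg'
driver = webdriver.Chrome()
driver.get('https://www.blockchain.com/btc/tx/800ce197af8a1a277ec314daba9c0b59c3ceee0f5beec415f5b8d54a3a9db96c')

# Click the element that reveals the remaining outputs
driver.find_element(By.CSS_SELECTOR, ".azsi2v-2").click()

# Collect every address item, including the one revealed by the click
result_list = driver.find_elements(By.CSS_SELECTOR, ".sc-19pxzmk-0")

for item in result_list:
    # Each item wraps a link whose href points at the address page
    print(item.find_element(By.CSS_SELECTOR, "a").get_attribute("href"))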

This gives me:

https://www.blockchain.com/btc/address/1DC6cb6mFcTgJAwFDEB65Qn457BzDxs3Wh
https://www.blockchain.com/btc/address/3JRj8b1cngQ1nJHwVPRXj1NFXRVzhMDFTf
https://www.blockchain.com/btc/address/3PdareoJL1N8t2BQAnKcVqkS9cdQQo6gLY
https://www.blockchain.com/btc/address/3LhFL4QhhSdtwuPBK4rwD2Z7VwndGVeoKR
https://www.blockchain.com/btc/address/3Nyhd9vMKxep6QhquDSea7yPg9TpCAKTEF
https://www.blockchain.com/btc/address/12a5iTzFRJGZ4H3sZV6UZv6GrUTiwyKyR6
https://www.blockchain.com/btc/address/3K4Hh5LDyqdryj7Xd1FBNgheE2aQHee97X
https://www.blockchain.com/btc/address/3CeQRAViNuqXHH3AcjmdnArCEbRRAdyxCm
https://www.blockchain.com/btc/address/1KHfhqk78kaSf5t1eC48pyLuxPHYTDstcK
https://www.blockchain.com/btc/address/1QKfADjViFcwjCjkmwK84oPXVNNRRDY9VK
https://www.blockchain.com/btc/address/bc1qt0pa5a7j5ay5slqxeujjvxs6zyq7l5z0lf97flxge2std02pfdyqkwlhv4
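
The link text in the question reads "Load more outputs... (1 remaining)", so a transaction with more outputs may need more than one click. Below is a minimal sketch of a helper that keeps clicking until the link is gone. The names click_until_gone and scrape_output_links, the XPath on the link text, and the fixed pause are assumptions made for illustration, and the class names are taken from the question and may have changed on the live site.

```python
import time


def click_until_gone(driver, locator, max_clicks=20, pause=0.5):
    """Click the element matched by `locator` until it no longer appears.

    `locator` is a (strategy, value) pair such as (By.XPATH, "//a[...]").
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # Re-query on every pass so we never hold a stale element reference
        matches = driver.find_elements(*locator)
        if not matches:
            break
        matches[0].click()
        clicks += 1
        time.sleep(pause)  # crude fixed wait for the new rows to render
    return clicks


def scrape_output_links(url):
    """Hypothetical end-to-end use; needs Selenium and a Chrome driver."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Assumption: the link can be found by its visible text
        load_more = (By.XPATH, "//a[contains(text(), 'Load more outputs')]")
        click_until_gone(driver, load_more)
        links = driver.find_elements(By.CSS_SELECTOR, ".sc-19pxzmk-0 a")
        return [link.get_attribute("href") for link in links]
    finally:
        driver.quit()
```

Calling scrape_output_links with the transaction URL from the question should return the same kind of list as printed above, provided the page still uses those class names. The max_clicks cap is there so the loop cannot run forever if the link never disappears.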

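To click any element on a page, locate it, for example by XPath with find_element, and call click() on it. For the element you mentioned, you can use the following:

from selenium.webdriver.common.by import By

driver.find_element(By.XPATH, "//a[@class='sc-1r996ns-0 AqGqw sc-1tbyx6t-1 kXxRxe iklhnl-0 boNhIO']").click()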

After that, the page source will be updated to include the content you want:

driver.page_source
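
Once the page source contains the extra output, the addresses can be pulled out of it without any further browser calls. The following is a rough sketch using only the standard library. It assumes address links contain "/btc/address/" in their href, as in the URLs printed in the other answer, and the names AddressLinkParser and extract_addresses are made up for this example.

```python
from html.parser import HTMLParser


class AddressLinkParser(HTMLParser):
    """Collects the last path segment of every <a> whose href is an address link."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: address pages live under /btc/address/<address>
        if "/btc/address/" in href:
            self.addresses.append(href.rsplit("/", 1)[-1])


def extract_addresses(page_source):
    """Return the addresses linked from the given HTML string, in page order."""
    parser = AddressLinkParser()
    parser.feed(page_source)
    return parser.addresses
```

With a live session this would be called as extract_addresses(driver.page_source) after the click has completed.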
