The page I'm trying to scrape is https://www.investagrams.com/Stock/ac, and I'm trying to get the price value (779.00), but my code only returns: {{ViewStockPage.Data.Stock.LatestStockHistory.Last | numberPriceFormat}}
My code:
from bs4 import BeautifulSoup
import requests
r = requests.get('https://www.investagrams.com/Stock/ac')
soup = BeautifulSoup(r.text, "lxml")
main = soup.find('div', class_= 'd-flex flex-row justify-content-between')
header = main.find('h4', class_= 'mb-0')
price = header.find('span', class_= 'mr-2').string
print(price)
The site's HTML as shown in the browser:
<h4 class="mb-0">
<small class="ng-binding">Ayala Corporation (PSE:AC) </small> <br>
<strong>
<span class="mr-2 ng-binding" data-ng-class="ViewStockPage.Data.Stock.LatestStockHistory.LastClass">779.00 </span>
<span data-ng-class="{'stock-up-caret stockprice-up' : ViewStockPage.Data.Stock.LatestStockHistory.ChangePercentage > 0, 'stock-down-caret stockprice-down' : ViewStockPage.Data.Stock.LatestStockHistory.ChangePercentage < 0, 'glyphicon glyphicon-minus stockprice-flat': ViewStockPage.Data.Stock.LatestStockHistory.ChangePercentage == 0}" class="stock-down-caret stockprice-down" style=""> </span>
<span style="font-size: 13px; vertical-align: middle;" data-ng-class="{'stockprice-up' : ViewStockPage.Data.Stock.LatestStockHistory.ChangePercentage > 0, 'stockprice-down' : ViewStockPage.Data.Stock.LatestStockHistory.ChangePercentage < 0, 'stockprice-flat': ViewStockPage.Data.Stock.LatestStockHistory.ChangePercentage == 0}" class="stockprice-down">
<span class="ng-binding">-21.00 </span>
<span class="ml-1 ng-binding">-2.62% </span>
</span>
</strong>
</h4>
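The page you are trying to scrape populates the DOM asynchronously using JavaScript. You can expect BeautifulSoup not to work for pages like this, because BeautifulSoup can only see what is baked directly into the HTML at the moment the server sends you the document.
If you view the page in a browser and log the network traffic, you will see multiple requests made to various REST API endpoints. One of them is /InvestaApi/Stock/ViewStock, which takes the stock code as a query-string parameter. The response from that endpoint is JSON and contains the information you are trying to get. You only need to imitate that HTTP GET request.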
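A minimal sketch of imitating that GET request is below. It uses only the standard library (urllib) in place of requests so it runs without extra packages. The host and the query parameters are taken from the request URL quoted elsewhere in this thread. The JSON key path and the need for browser-like headers are assumptions, not confirmed details of the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://webapi.investagrams.com/InvestaApi/Stock/ViewStock"


def build_url(stock_code):
    # Query parameters copied from the request visible in the browser's Network tab.
    params = {"stockCode": stock_code, "defaultExchangeType": 1}
    return API_URL + "?" + urlencode(params)


def extract_last_price(payload):
    # ASSUMPTION: the JSON mirrors the template expression
    # ViewStockPage.Data.Stock.LatestStockHistory.Last seen in the page source.
    return payload["LatestStockHistory"]["Last"]


def fetch_last_price(stock_code):
    # ASSUMPTION: the API may reject requests that lack browser-like headers.
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Referer": "https://www.investagrams.com/",
    }
    request = Request(build_url(stock_code), headers=headers)
    with urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return extract_last_price(payload)
```

Calling fetch_last_price("ac") performs the actual request. If the key path turns out to be different, print the parsed payload once and adjust extract_last_price accordingly.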
This page uses JavaScript to fill in the values at the {{...}} placeholders, but requests and BeautifulSoup can't run JavaScript. You may need Selenium to control a real web browser that can run JavaScript.
Using DevTools in Firefox/Chrome (tab: Network, filter: XHR), I found that JavaScript loads the data from https://webapi.investagrams.com/InvestaApi/Stock/ViewStock?stockCode=ac&defaultExchangeType=1&cv=1622292000-0-
Using requests with some headers, I can get it too. Because the data comes back as JSON, I don't need BeautifulSoup for this.
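As a quick check for whether a page is in this situation, you can look for unrendered {{...}} template placeholders in the raw HTML that the server returns. The sketch below is only an illustration: the function name and the two sample strings are made up for the example, modelled on the output and the HTML shown in the question.

```python
import re

# Matches template placeholders such as {{ something | filter }}.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def find_unrendered_placeholders(html):
    # Return every {{...}} expression still present in the HTML text.
    return PLACEHOLDER.findall(html)


# Sample of what the server sends before JavaScript has run (hypothetical markup).
raw = (
    '<span class="mr-2">'
    "{{ViewStockPage.Data.Stock.LatestStockHistory.Last | numberPriceFormat}}"
    "</span>"
)
# Sample of what the browser shows after JavaScript has filled in the value.
rendered = '<span class="mr-2 ng-binding">779.00 </span>'

print(find_unrendered_placeholders(raw))
print(find_unrendered_placeholders(rendered))
```

If the first list is non-empty for the text you got from requests, the values are filled in client-side, and parsing that HTML will not give you the numbers.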