Creating a price list with Python

Posted 2024-06-18 14:35:43

I am analyzing the data on this website: Electricity prices

I tried to do it with Beautiful Soup:

from bs4 import BeautifulSoup
import requests
page = requests.get('https://transparency.entsoe.eu/transmission-domain/r2/dayAheadPrices/show?name=&defaultValue=false&viewType=TABLE&areaType=BZN&atch=false&dateTime.dateTime=01.10.2018+00:00%7CCET%7CDAY&biddingZone.values=CTY%7C10YAT-APG------L!BZN%7C10YAT-APG------L&dateTime.timezone=CET_CEST&dateTime.timezone_input=CET+(UTC+1)+/+CEST+(UTC+2)')
soup = BeautifulSoup(page.text, 'html.parser')
price_hide = soup.find(class_='dv-value-cell')
print(price_hide)

This is what I get so far:

<td class="dv-value-cell">
<span       onclick="showDetail('eu.entsoe.emfip.transmission_domain.r2.presentation.entity.DayAheadPricesMongoEntity', '5bb0b150623a7295d97e9b6d', '2018-09-30T22:00:00.000Z', 'PRICE', 'CET');">59.53</span>

But how do I scrape the whole table?


3 answers

First find all the td tags, then extract the text inside the span tag of each one:

timestamps=soup.find_all("td",class_="first")
prices=soup.find_all("td",class_="dv-value-cell")

for t,p in zip(timestamps,prices):
    print(t.text.strip()," ",p.span.text.strip())


00:00 - 01:00   59.53
01:00 - 02:00   56.10
02:00 - 03:00   51.41
03:00 - 04:00   47.38
04:00 - 05:00   47.59
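
Since the goal is a price list rather than printed lines, the same zip approach can collect the rows into a Python list. The following is only a minimal sketch: SAMPLE_HTML is a hypothetical stand-in for the downloaded page (it assumes the cells use the "first" and "dv-value-cell" classes seen above), and build_price_list is an illustrative helper name, not part of any library.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup imitating the table cells shown in the question.
# With the real site, pass page.text from requests.get(...) instead.
SAMPLE_HTML = """
<table>
  <tr>
    <td class="first">00:00 - 01:00</td>
    <td class="dv-value-cell"><span>59.53</span></td>
  </tr>
  <tr>
    <td class="first">01:00 - 02:00</td>
    <td class="dv-value-cell"><span>56.10</span></td>
  </tr>
</table>
"""


def build_price_list(html):
    """Return a list of (time_range, price) tuples, one per table row."""
    soup = BeautifulSoup(html, "html.parser")
    timestamps = soup.find_all("td", class_="first")
    prices = soup.find_all("td", class_="dv-value-cell")
    # zip pairs the n-th time cell with the n-th price cell.
    return [
        (t.get_text(strip=True), float(p.span.get_text(strip=True)))
        for t, p in zip(timestamps, prices)
    ]


price_list = build_price_list(SAMPLE_HTML)
print(price_list)
```

Note that zip stops at the shorter of the two lists, so if the page has a different number of time cells and price cells, the extra cells are silently dropped.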

Is this what you are looking for?

from bs4 import BeautifulSoup
import requests
page = requests.get('https://transparency.entsoe.eu/transmission-domain/r2/dayAheadPrices/show?name=&defaultValue=false&viewType=TABLE&areaType=BZN&atch=false&dateTime.dateTime=01.10.2018+00:00%7CCET%7CDAY&biddingZone.values=CTY%7C10YAT-APG------L!BZN%7C10YAT-APG------L&dateTime.timezone=CET_CEST&dateTime.timezone_input=CET+(UTC+1)+/+CEST+(UTC+2)')
soup = BeautifulSoup(page.text, 'html.parser')
price_hide = soup.find_all(class_='dv-value-cell')
for price in price_hide:
    print(price.text.strip())

It produces the following output:

59.53
56.10
51.41
47.38
47.59
51.61
69.13
77.32
...

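You need to use soup.find_all() instead of soup.find(): find() returns only the first matching element, while find_all() returns every match. Then apply further logic to extract the results you need.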
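
As one possible form of that "further logic", the cell texts can be converted to numbers before use. This is a rough sketch with assumptions: parse_prices is a hypothetical helper, raw_cells is made-up input standing in for the price.text values, and skipping non-numeric cells is just one way to handle them.

```python
def parse_prices(cell_texts):
    """Convert scraped cell strings to floats, skipping non-numeric cells."""
    prices = []
    for text in cell_texts:
        cleaned = text.strip()
        try:
            prices.append(float(cleaned))
        except ValueError:
            # Assumption: empty or placeholder cells are simply skipped.
            continue
    return prices


# Made-up strings as they might come out of price.text, whitespace included.
raw_cells = ["\n59.53\n", " 56.10 ", "", "51.41"]
prices = parse_prices(raw_cells)
print(prices)
print(max(prices), min(prices))
```

With floats in a list, the usual tools such as max(), min(), and sum() work directly on the scraped values.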
