Getting values with BS4

<th class="num record drop-3" data-tsorter="data-val"> proj. pts. pts. </th>
<th class="pct" data-tsorter="data-val"> relegated rel. </th>
<th class="pct" data-tsorter="data-val"> qualify for UCL make UCL </th>
<th class="pct sorted" data-tsorter="data-val"> win Premier League win league </th>

import requests
from bs4 import BeautifulSoup

url = 'https://projects.fivethirtyeight.com/soccer-predictions/premier-league/'
r = requests.get(url=url)
soup = BeautifulSoup(r.text, "html.parser")
table = soup.find("table", {"class": "forecast-table"})
# print(table.prettify())
for i in table.find_all("td", {"class": "pct"}):
    print(i)

1 Answer

User

#1 · Posted 2024-06-01 05:44:25

I'm not entirely sure which specific columns you want, but this will get every column that has a data-val attribute on its tag:

import requests
from bs4 import BeautifulSoup

url = 'https://projects.fivethirtyeight.com/soccer-predictions/premier-league/'
r = requests.get(url)

soup = BeautifulSoup(r.text, "html.parser")
table = soup.find("table", {"class": "forecast-table"})

# Each team is one <tr class="team-row">; its name is in the data-str attribute.
team_rows = table.find_all("tr", {"class": "team-row"})

for team in team_rows:
    print("Team name: {}".format(team['data-str']))

    team_data = team.find_all("td")

    # Print the data-val attribute of every cell that carries one.
    for data in team_data:
        if 'data-val' in data.attrs:
            print("\t{}".format(data.attrs['data-val']))
    print("\n")

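If I did understand your question correctly, you are looking for the last two values, which are not marked up in the HTML source. In that case, you could try simply looking up tag[6], although that is certainly not very robust. Then again, this is HTML parsing, so "not very robust" is par for the course, IMHO.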
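To make the positional lookup a little less fragile, here is a minimal sketch of what I mean. It runs on a made-up sample row instead of the live page, so the cell order, the values, and the helper name cell_text_at are all my assumptions and not something taken from the real site:

```python
from bs4 import BeautifulSoup

# Hypothetical sample row; the real page's column order and values may differ.
SAMPLE_HTML = """
<table class="forecast-table">
  <tr class="team-row" data-str="Example FC">
    <td class="team">Example FC</td>
    <td data-val="80.1">80.1</td>
    <td data-val="75.3">75.3</td>
    <td data-val="1.2">1.2</td>
    <td data-val="88">88</td>
    <td class="pct" data-val="0.01">1%</td>
    <td class="pct" data-val="0.62">62%</td>
    <td class="pct" data-val="0.15">15%</td>
  </tr>
</table>
"""


def cell_text_at(row, index):
    """Return the stripped text of the td at `index`, or None if out of range."""
    cells = row.find_all("td")
    if not 0 <= index < len(cells):
        return None
    return cells[index].get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
row = soup.find("tr", {"class": "team-row"})

seventh = cell_text_at(row, 6)    # seventh cell of the sample row
missing = cell_text_at(row, 20)   # index past the end of the row
print(seventh, missing)
```

The bounds check only protects against rows that are shorter than expected. If the site reorders its columns, index 6 will silently point at a different value, so check it against the page's current markup.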

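What I'm doing here is finding all the team rows (which is easy thanks to the class name), and then simply looping over all the td tags inside each team row's tr.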
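If you would rather keep the values than print them, the same row-then-cell loop can fill a dictionary. This is only a sketch under assumptions: the sample HTML, the team names, and the function name collect_data_vals are invented for illustration, and I'm assuming the real rows carry data-str and data-val the same way:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; team names and values are made up.
SAMPLE_HTML = """
<table class="forecast-table">
  <tr class="team-row" data-str="Example FC">
    <td class="team">Example FC</td>
    <td class="pct" data-val="0.01">1%</td>
    <td class="pct" data-val="0.62">62%</td>
  </tr>
  <tr class="team-row" data-str="Sample United">
    <td class="team">Sample United</td>
    <td class="pct" data-val="0.30">30%</td>
    <td class="pct" data-val="0.05">5%</td>
  </tr>
</table>
"""


def collect_data_vals(html):
    """Map each team name (data-str) to the list of its cells' data-val strings."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for row in soup.find_all("tr", {"class": "team-row"}):
        # attrs={"data-val": True} keeps only td tags that have the attribute.
        cells = row.find_all("td", attrs={"data-val": True})
        result[row["data-str"]] = [cell["data-val"] for cell in cells]
    return result


teams = collect_data_vals(SAMPLE_HTML)
for name, values in teams.items():
    print(name, values)
```

The values come back as strings, exactly as they appear in the attribute, so convert them with float() yourself if you need numbers.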
