Converting HTML data from a website into a DataFrame

Published 2024-04-25 22:04:09


I want to load the data from a website (https://projects.fivethirtyeight.com/soccer-predictions/super-lig/) into a pandas DataFrame, but when I try to read the HTML, I get the following error:

ValueError: No tables found

Here is the code I used:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from bs4 import BeautifulSoup
from urllib.request import urlopen
from selenium import webdriver
from pandas.io.html import read_html

driver = webdriver.Chrome(executable_path="C:/Users/Admin/Documents/chromedriver_win32/chromedriver")

link = "https://projects.fivethirtyeight.com/soccer-predictions/super-lig/"

driver.get(link)

table = driver.find_element_by_xpath('//*[@id="forecast-table"]')

table_html = table.get_attribute('innerHTML')

df = read_html(table_html)

Here is what (part of) table_html looks like:

'<thead><tr class="desktop"><th class="top nosort"></th><th class="top bordered-right rating nosort drop-6" colspan="3">Team rating</th><th class="top bordered-right nosort drop-1" colspan="5">avg. simulated season</th><th class="top bordered-right nosort show-1 drop-3" colspan="2">avg. simulated season</th><th class="top bordered nosort" colspan="4">end-of-season probabilities</th></tr><tr class="sep"><th colspan="11"></th></tr><tr class="lower"><th class="team bold" data-tsorter="data-str">team</th><th class="num rating overall drop-6" data-tsorter="data-val">spi</th><th class="num rating offense drop-6" data-tsorter="data-val">off.</th><th class="num rating defense drop-6" data-tsorter="data-val">def.</th><th class="num wins record drop-1" data-tsorter="numeric">W</th><th class="num ties record drop-1" data-tsorter="numeric">D</th><th class="num losses record drop-1" data-tsorter="numeric">L</th><th class="num record drop-3" data-tsorter="numeric">goal diff.</th><th class="num record drop-3" data-tsorter="data-val"><span class="long-points">proj. pts.</span><span class="short-points">pts.</span></th><th class="pct drop-5" data-tsorter="data-val"><span class="full-relegated">relegated</span><span class="small-relegated">rel.</span></th><th class="pct" data-tsorter="data-val"><span class="full-champ">qualify for UCL</span><span class="small-champ">qualify for UCL</span></th><th class="pct sorted" data-tsorter="data-val"><span class="drop-1">win Süper Lig</span><span class="small-league">win league</span></th></tr></thead><tbody><tr class="team-row" data-str="Galatasaray"><td class="team" data-str="galatasaray"><div class="logo"><img src="https://secure.espn.com/combiner/i?img=/i/teamlogos/soccer/500/432.png&amp;w=56" alt="team-logo" onerror="this.onerror=null; this.src=\'https://secure.
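A likely cause, judging from the excerpt above: the string starts at <thead>, so it contains no <table> element. innerHTML returns only the children of the matched node, while read_html looks for <table> tags and raises "No tables found" when there are none. The sketch below shows one way around this, assuming the fragment really is the inside of a table. The helper names ensure_table_wrapper and fragment_to_dataframe and the sample fragment are hypothetical and not part of the original code; read_html also needs an HTML parser such as lxml (or bs4 with html5lib) installed.

```python
from io import StringIO


def ensure_table_wrapper(fragment: str) -> str:
    """Wrap a <thead>/<tbody> fragment in <table> tags if they are missing."""
    if fragment.lstrip().lower().startswith("<table"):
        return fragment
    return f"<table>{fragment}</table>"


def fragment_to_dataframe(fragment: str):
    """Parse a table fragment into a DataFrame (hypothetical helper)."""
    # Imported here so the wrapper above stays usable without pandas.
    import pandas as pd

    # read_html returns a list of DataFrames, one per <table> found,
    # so take the first element to get the DataFrame itself.
    return pd.read_html(StringIO(ensure_table_wrapper(fragment)))[0]


# Made-up stand-in for table.get_attribute('innerHTML'); not real site data.
inner_html = (
    "<thead><tr><th>team</th><th>spi</th></tr></thead>"
    "<tbody><tr><td>Team A</td><td>1.0</td></tr></tbody>"
)
wrapped = ensure_table_wrapper(inner_html)
```

With the original script, the equivalent change would be either to call table.get_attribute('outerHTML'), which keeps the <table> tag itself, or to wrap table_html as above before passing it to read_html. In both cases read_html returns a list, so the DataFrame is its first element.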
