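I currently get this table with BeautifulSoup and want to split it into multiple DataFrames, starting a new one every time a green header element appears.
Here is the web page: http://www.greyhound-data.com/d?page=stadia&st=1011&land=au&stadiummode=3
This is what I have so far. I can't work out how to do the split, because in problems like this I'm used to the sections simply being separate tables.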
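from io import StringIO
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = "http://www.greyhound-data.com/d?page=stadia&st=1011&land=au&stadiummode=3"
req = requests.get(url).text
soup = BeautifulSoup(req, 'lxml')
# The page has several tables with id="green"; the last one holds the records.
table = soup.find_all("table", attrs={'id': "green"})
table = table[-1]
# Wrap the HTML in StringIO: newer pandas versions deprecate passing a literal HTML string.
df = pd.read_html(StringIO(str(table)))[0]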
output:
Year quarter ... Set on
Distance: 331 m / 362 y ... Distance: 331 m / 362 y
0 2020 2nd ... 15 JUN 2020
1 2020 1st ... 23 JAN 2020
2 2019 4th ... 6 OCT 2019
3 2019 3rd ... 1 SEP 2019
4 2019 2nd ... 28 APR 2019
.. ... ... ...
319 2002 3rd ... 5 SEP 2002
320 2002 2nd ... 6 JUN 2002
321 2001 4th ... 18 OCT 2001
322 2001 3rd ... 16 AUG 2001
323 2001 2nd ... 14 JUN 2001
[324 rows x 7 columns]
This script splits the table into several DataFrames:
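A minimal sketch of one way to do such a split with pandas alone, not necessarily the approach the answer used. It assumes that each green header shows up in the parsed frame as an ordinary row whose cells repeat the text "Distance: ...", as the first one does in the column header of the output above. The function name `split_on_header_rows`, the `marker` parameter, and the small `demo` frame (including its second distance value) are made up for illustration.

```python
import pandas as pd


def split_on_header_rows(df, marker="Distance:", first_title="untitled"):
    """Split df into sub-frames at rows whose first cell starts with `marker`.

    Returns a dict mapping each header text to the rows that follow it.
    Rows that appear before the first header are stored under `first_title`.
    """
    first_col = df.iloc[:, 0].astype(str)
    is_header = first_col.str.startswith(marker)
    # Every header row starts a new group, so a running sum gives a group id.
    group_id = is_header.cumsum()

    frames = {}
    for _, chunk in df.groupby(group_id):
        first_cell = str(chunk.iloc[0, 0])
        if first_cell.startswith(marker):
            title, body = first_cell, chunk.iloc[1:]  # drop the header row
        else:
            title, body = first_title, chunk
        frames[title] = body.reset_index(drop=True)
    return frames


# Hypothetical stand-in for the scraped table (values are illustrative only).
demo = pd.DataFrame(
    {
        "Year": ["Distance: 331 m / 362 y", "2020", "2019",
                 "Distance: 457 m / 500 y", "2020"],
        "quarter": ["Distance: 331 m / 362 y", "2nd", "4th",
                    "Distance: 457 m / 500 y", "1st"],
    }
)

frames = split_on_header_rows(demo)
for title, frame in frames.items():
    print(title)
    print(frame)
```

With the frame built in the question, the first distance sits in the column header rather than in a row, so the rows before the second header would land under `first_title` unless that header text is passed in explicitly.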
Prints:
EDIT: To get the distance column, do the following:
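One possible way to attach the distance to every row, sketched under the same assumption that the green headers appear as rows starting with "Distance:". The helper name `add_distance_column` and the sample frame are hypothetical. The idea is to keep the header text only on the header rows, forward-fill it downwards, and then drop the header rows.

```python
import pandas as pd


def add_distance_column(df, marker="Distance:"):
    """Return a copy of df with a 'Distance' column and the header rows removed."""
    first_col = df.iloc[:, 0].astype(str)
    is_header = first_col.str.startswith(marker)

    out = df.copy()
    # Keep the text on header rows only (NaN elsewhere), then carry it down.
    out["Distance"] = first_col.where(is_header).ffill()
    # The header rows have done their job, so remove them.
    return out.loc[~is_header].reset_index(drop=True)


# Hypothetical stand-in for the scraped table (values are illustrative only).
sample = pd.DataFrame(
    {
        "Year": ["Distance: 331 m / 362 y", "2020", "2019",
                 "Distance: 457 m / 500 y", "2020"],
        "quarter": ["Distance: 331 m / 362 y", "2nd", "4th",
                    "Distance: 457 m / 500 y", "1st"],
    }
)

labelled = add_distance_column(sample)
print(labelled)
```

A single labelled frame like this can still be split later with `labelled.groupby("Distance")` if separate DataFrames are needed.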
Prints: