How to extract a table by its id using BeautifulSoup

import requests
from bs4 import BeautifulSoup

url = "https://afltables.com/afl/stats/teams/adelaide/2018_gbg.html"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find_all('table', id='sortableTable0')
print(table)

1 answer

User

#1 · Posted 2024-05-16 22:28:31

This table is generated dynamically by JavaScript, so you need a tool that can execute it. One option in Python is Selenium, as shown below:

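from bs4 import BeautifulSoup
from selenium import webdriver

# Launch a real browser so the page's JavaScript runs
driver = webdriver.Firefox()
driver.get("https://afltables.com/afl/stats/teams/adelaide/2018_gbg.html")

# Grab the rendered HTML, then close the browser
html = driver.page_source
driver.quit()
soup = BeautifulSoup(html, "lxml")

# find_all returns a list of matching tables (empty if none match)
table = soup.find_all('table', {'id':'sortableTable0'})
print(table)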

Interestingly, the page source contains the following element just before the div that holds the table:

<noscript>This page requires Javascript enabled to function<br><br></noscript>
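Once the rendered HTML is available, the table still has to be turned into usable data. The following is a minimal sketch of that step, not a definitive implementation: SAMPLE_HTML is a made-up stand-in for driver.page_source (the real page's markup may well differ), and table_rows is a hypothetical helper name. It uses find() rather than find_all(), since an id should match at most one element, and it returns an empty list when the id is absent, which is what happens when the HTML was fetched without running JavaScript.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source; the real markup may differ.
SAMPLE_HTML = """
<noscript>This page requires Javascript enabled to function<br><br></noscript>
<div>
  <table id="sortableTable0">
    <tr><th>Player</th><th>R1</th></tr>
    <tr><td>Player A</td><td>20</td></tr>
    <tr><td>Player B</td><td>15</td></tr>
  </table>
</div>
"""


def table_rows(html, table_id):
    """Return the table's rows as lists of cell text, or [] if the id is absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)  # find() returns one Tag or None
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Collect header and data cells alike, stripped of surrounding whitespace
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        rows.append(cells)
    return rows


rows = table_rows(SAMPLE_HTML, "sortableTable0")
for row in rows:
    print(row)
```

With Selenium, the same helper could be called as table_rows(driver.page_source, "sortableTable0"), assuming the page has finished rendering by then.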
