Scraping a table from a static website

import requests
from bs4 import BeautifulSoup

URL = 'https://www.iana.org/domains/root/db'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id='tld-table')

2 answers

User

Answer 1 · edited 2024-05-29 06:13:33

You can use pandas' pd.read_html:

import pandas as pd

URL = "https://www.iana.org/domains/root/db"

df = pd.read_html(URL)[0]

print(df.head())
    Domain     Type                            TLD Manager
0     .aaa  generic  American Automobile Association, Inc.
1    .aarp  generic                                   AARP
2  .abarth  generic         Fiat Chrysler Automobiles N.V.
3     .abb  generic                                ABB Ltd
4  .abbott  generic              Abbott Laboratories, Inc.
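If the page ever contains more than one table, relying on index [0] can be fragile. A minimal sketch, assuming the table keeps the id 'tld-table' from the question: pd.read_html accepts an attrs dictionary to select tables by HTML attribute. The inline SAMPLE_HTML below is a made-up stand-in for the real page so the example runs offline; it is not the actual IANA markup.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample markup standing in for the real page. Only the id
# "tld-table" and the column names are taken from the thread.
SAMPLE_HTML = """
<table id="other"><tr><th>x</th></tr><tr><td>1</td></tr></table>
<table id="tld-table">
  <thead><tr><th>Domain</th><th>Type</th><th>TLD Manager</th></tr></thead>
  <tbody>
    <tr><td>.aaa</td><td>generic</td><td>American Automobile Association, Inc.</td></tr>
    <tr><td>.aarp</td><td>generic</td><td>AARP</td></tr>
  </tbody>
</table>
"""

# attrs keeps only tables whose HTML attributes match, so the unrelated
# first table is ignored. StringIO wraps the literal HTML string.
tables = pd.read_html(StringIO(SAMPLE_HTML), attrs={"id": "tld-table"})
df = tables[0]
print(df.head())
```

Against the live site, the same attrs argument could be passed along with the URL instead of the StringIO object.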

User

Answer 2 · edited 2024-05-29 06:13:33

pandas can already read tables from HTML, so there is no need to use BeautifulSoup:

import pandas as pd

url = "https://www.iana.org/domains/root/db"
# This returns a list of DataFrames with all tables in the page. 
df = pd.read_html(url)[0]
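For comparison, here is a rough sketch of what the BeautifulSoup route from the question would still need after soup.find(id='tld-table'): walking the rows and cells by hand. The SAMPLE_HTML string is a hypothetical stand-in for the real page so the snippet runs without network access; the actual markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup. Only the id "tld-table" and the column
# names are taken from the thread.
SAMPLE_HTML = """
<table id="tld-table">
  <thead><tr><th>Domain</th><th>Type</th><th>TLD Manager</th></tr></thead>
  <tbody>
    <tr><td>.aaa</td><td>generic</td><td>American Automobile Association, Inc.</td></tr>
    <tr><td>.aarp</td><td>generic</td><td>AARP</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find(id="tld-table")

# Header cells give the column names.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # the header row has no <td> cells, so it is skipped
        rows.append(dict(zip(headers, cells)))

print(rows[0])
```

This works, but pd.read_html does the same row and header handling in a single call, which is the point both answers make.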
