import pandas as pd
# Read in all tables at this address as pandas dataframes
results = pd.read_html('https://markets.cboe.com/europe/equities/market_share/index/all')
# Grab the second table found
df = results[1]
# Set the first column as the index
df = df.set_index(0)
# Switch columns and indexes
df = df.T
# Drop any columns that have no data in them
df = df.dropna(how='all', axis=1)
# Set the column under "Displayed Price Venues" as the index
df = df.set_index('Displayed Price Venues')
# Switch columns and indexes again
df = df.T
# Aesthetic. Don't like having an index name myself!
df.index.name = None
# Separate the three subtables from each other!
displayed = df.iloc[0:18]
non_displayed = df.iloc[18:-1]
total = df.iloc[-1]
You can also do this more compactly (the same code, just without the steps broken out):
import pandas as pd
# Read in all tables at this address as pandas dataframes
results = pd.read_html('https://markets.cboe.com/europe/equities/market_share/index/all')
# Do all the stuff above in one go
df = results[1].set_index(0).T.dropna(how='all', axis=1).set_index('Displayed Price Venues').T
# Aesthetic. Don't like having an index name myself!
df.index.name = None
# Separate the three subtables from each other!
displayed = df.iloc[0:18]
non_displayed = df.iloc[18:-1]
total = df.iloc[-1]
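To see what the chained reshaping does without hitting the network, here is a minimal sketch on a hand-built frame. The frame `raw`, the helper `tidy`, and every label and number in it are made up for illustration; the real `results[1]` will look different. The sketch assumes the scraped table has integer column labels, its row labels in column `0`, and section-header rows whose other cells are empty.

```python
import pandas as pd

# Hypothetical stand-in for results[1]; all labels and values are invented.
raw = pd.DataFrame([
    ["Displayed Price Venues", "Value Traded", "Market Share"],
    ["Venue A", "100", "40%"],
    ["Venue B", "50", "20%"],
    ["Non-Displayed Price Venues", None, None],  # section header, no data
    ["Dark Pool X", "100", "40%"],
    ["Total", "250", "100%"],
])


def tidy(table: pd.DataFrame) -> pd.DataFrame:
    """Apply the same chain of steps as the answer's one-liner."""
    out = (
        table.set_index(0)                        # row labels become the index
        .T                                        # rows <-> columns
        .dropna(how="all", axis=1)                # drops the empty section-header row
        .set_index("Displayed Price Venues")      # header row becomes the index
        .T                                        # back to one row per venue
    )
    out.index.name = None                         # clear the leftover index name
    return out


df = tidy(raw)
print(df)
```

After the second transpose, the former header row supplies the column labels and the data-free section-header row is gone, which is why the positional slices afterwards only see real venue rows.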
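The problem is that the id keeps changing dynamically. Otherwise I would have used it, but that is not possible.
Assuming that output is what you are looking for, this should work as long as the page content does not change or shift. I would suggest giving the pandas HTML reader a go, as in the code above.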
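The hard-coded `iloc[0:18]` split is the part most likely to break if rows are added or removed. One possible alternative, sketched below, is to look up the split points by row label instead of by position. The helper `split_at`, the venue names, and the figures are all hypothetical, and the sketch assumes the row labels are unique and that the last subtable is a single row labelled "Total".

```python
import pandas as pd

# Hypothetical cleaned frame: one row per venue plus a final "Total" row.
# All labels and numbers are invented for illustration.
df = pd.DataFrame(
    {"Market Share": [40.0, 20.0, 25.0, 15.0, 100.0]},
    index=["Venue A", "Venue B", "Dark Pool X", "Dark Pool Y", "Total"],
)


def split_at(frame, first_non_displayed, total_label="Total"):
    """Split into (displayed, non_displayed, total) using row labels.

    Assumes index labels are unique, so get_loc returns an integer position.
    """
    start = frame.index.get_loc(first_non_displayed)
    end = frame.index.get_loc(total_label)
    return frame.iloc[:start], frame.iloc[start:end], frame.loc[total_label]


displayed, non_displayed, total = split_at(df, "Dark Pool X")
print(displayed)
print(non_displayed)
print(total)
```

This still depends on knowing the label of the first non-displayed venue, so it trades one assumption (row count) for another (row name); which is more stable depends on how the page actually changes.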