My initial DataFrame looks like this:
import pandas as pd
df = pd.DataFrame(
    data=[
        ['Core', 'PM2', 1234, 'Direct', '2019-11-08 00:00:00', '2019-11-08 00:59:59', 3.300, 'V'],
        ['Long Term', 'Wind', 1111, 'Direct', '2019-11-09 00:00:00', '2019-11-09 00:59:59', 0.00123, 'V'],
    ],
    columns=['Program', 'Parameter', 'Station', 'Method', 'Start', 'End', 'Measurement', 'Flag'],
)
df
     Program Parameter  Station  Method                Start                  End  Measurement Flag
0       Core       PM2     1234  Direct  2019-11-08 00:00:00  2019-11-08 00:59:59      3.30000    V
1  Long Term      Wind     1111  Direct  2019-11-09 00:00:00  2019-11-09 00:59:59      0.00123    V
Then I set the index on the DataFrame:
df_index = df.set_index(['Start','End','Measurement','Flag'])
df_index
This gives me:
                                                            Program Parameter  Station  Method
Start               End                 Measurement Flag
2019-11-08 00:00:00 2019-11-08 00:59:59 3.30000     V            Core       PM2     1234  Direct
2019-11-09 00:00:00 2019-11-09 00:59:59 0.00123     V       Long Term      Wind     1111  Direct
Next, I create a MultiIndex for the columns:
df_columns = pd.MultiIndex.from_frame(df_index[['Program','Parameter','Station','Method']])
Then I use that MultiIndex to build a new DataFrame:
data = pd.DataFrame(df_index, index=df_index.index, columns=df_columns)
data
This gives me:
Program                                                    Core Long Term
Parameter                                                   PM2      Wind
Station                                                    1234      1111
Method                                                   Direct    Direct
Start               End                 Measurement Flag
2019-11-08 00:00:00 2019-11-08 00:59:59 3.30000     V       NaN       NaN
2019-11-09 00:00:00 2019-11-09 00:59:59 0.00123     V       NaN       NaN
What I want is for the MultiIndex columns Program, Parameter, Station, and Method to group each Measurement and Flag beneath them, with Start and End as the index:
Program                                        Core            Long Term
Parameter                                       PM2                 Wind
Station                                        1234                 1111
Method                                       Direct               Direct
Start               End                 Measurement Flag   Measurement Flag
2019-11-08 00:00:00 2019-11-08 00:59:59     3.30000    V
2019-11-09 00:00:00 2019-11-09 00:59:59                        0.00123    V
Any help would be greatly appreciated.
You can try a series of stack/unstack operations:
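A minimal sketch of one such chain, assuming the df from the question. The NaN values in the attempt above appear because passing a DataFrame together with columns= to pd.DataFrame selects columns by label, and none of the new MultiIndex labels exist in df_index. Here the reshaping is done with set_index, stack and unstack instead. The level name 'Field' and the variable col_levels are names introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    data=[
        ['Core', 'PM2', 1234, 'Direct', '2019-11-08 00:00:00', '2019-11-08 00:59:59', 3.300, 'V'],
        ['Long Term', 'Wind', 1111, 'Direct', '2019-11-09 00:00:00', '2019-11-09 00:59:59', 0.00123, 'V'],
    ],
    columns=['Program', 'Parameter', 'Station', 'Method', 'Start', 'End', 'Measurement', 'Flag'],
)

# Columns that should become the upper levels of the column MultiIndex.
col_levels = ['Program', 'Parameter', 'Station', 'Method']

out = (
    df.set_index(['Start', 'End'] + col_levels)  # only Measurement and Flag stay as columns
      .rename_axis(columns='Field')              # 'Field' is an assumed name for that column level
      .stack()                                   # Measurement/Flag become the innermost index level
      .unstack(col_levels + ['Field'])           # move everything except Start/End into the columns
)

print(out)
```

Cells with no matching row come back as NaN; out.fillna('') shows them as blanks for display. Because Measurement and Flag pass through a single stacked Series, the resulting columns have object dtype, so the Measurement columns may need pd.to_numeric if you want to do arithmetic on them.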