Converting a MultiIndex DataFrame to a flat DataFrame in pandas
I have a MultiIndex DataFrame in pandas called groupt3. When I run groupt3.head(), it looks like this:
                 datetime  song  sum  rat
artist datetime
2562   8                2     2   26    0
       46              19    19   26    0
       47               3     3   26    0
4Hero  1                2     2   32    0
       26              20    20   32    0
       9               10    10   32    0
I would like a "flattened" DataFrame in which the artist and datetime index values are repeated on every row, like this:
artist date time song sum rat
2562 8 2 26 0
2562 46 19 26 0
2562 47 3 26 0
and so on...
Thanks.
2 Answers
12
I think you can use the reset_index method:
import pandas as pd
import numpy as np
np.random.seed(0)  # fixed seed so the random values are reproducible
# Two parallel lists, one per index level
arrays = [['Monday','Monday','Thursday','Thursday'],
          ['Morning','Noon','Morning','Evening']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['Weekday', 'Time'])
df = pd.DataFrame(np.random.randint(5, size=(4,2)), index=index)
print(df)
                  0  1
Weekday  Time
Monday   Morning  4  0
         Noon     3  3
Thursday Morning  3  1
         Evening  3  2
# reset_index() moves every index level into an ordinary column
print(df.reset_index())
    Weekday     Time  0  1
0    Monday  Morning  4  0
1    Monday     Noon  3  3
2  Thursday  Morning  3  1
3  Thursday  Evening  3  2
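One thing to watch for with the frame in the question: judging by the head() output, groupt3 appears to have a column named datetime as well as an index level named datetime. As far as I know, reset_index() refuses to insert a level whose name collides with an existing column and raises a ValueError. Below is a minimal sketch of one possible workaround, renaming the index level first. The data is retyped from the head() output in the question, the artist labels are assumed to be strings, and date_time is only an illustrative name.

```python
import pandas as pd

# Frame retyped from the question's head() output (assumption: artist labels are strings).
index = pd.MultiIndex.from_tuples(
    [("2562", 8), ("2562", 46), ("2562", 47),
     ("4Hero", 1), ("4Hero", 26), ("4Hero", 9)],
    names=["artist", "datetime"],
)
groupt3 = pd.DataFrame(
    {
        "datetime": [2, 19, 3, 2, 20, 10],
        "song": [2, 19, 3, 2, 20, 10],
        "sum": [26, 26, 26, 32, 32, 32],
        "rat": [0, 0, 0, 0, 0, 0],
    },
    index=index,
)

# The index level "datetime" clashes with the column "datetime",
# so a plain reset_index() is expected to fail here.
try:
    groupt3.reset_index()
    clash = False
except ValueError:
    clash = True

# Rename the clashing level ("date_time" is a hypothetical name), then flatten.
renamed = groupt3.copy()
renamed.index = renamed.index.set_names(["artist", "date_time"])
flat = renamed.reset_index()
print(flat)
```

After this, artist and date_time are ordinary columns repeated on every row, and the original datetime column can be dropped with flat.drop(columns="datetime") if it is not needed.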
13
Use the pandas.DataFrame.to_records() method.
Example:
import pandas as pd
import numpy as np
# Two parallel lists, one per index level
arrays = [['Monday','Monday','Thursday','Thursday'],
          ['Morning','Noon','Morning','Evening']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['Weekday', 'Time'])
# No seed is set, so the values will differ from run to run
df = pd.DataFrame(np.random.randint(5, size=(4,2)), index=index)
In [39]: df
Out[39]:
                  0  1
Weekday  Time
Monday   Morning  1  3
         Noon     2  1
Thursday Morning  3  3
         Evening  1  2
In [40]: pd.DataFrame(df.to_records())
Out[40]:
    Weekday     Time  0  1
0    Monday  Morning  1  3
1    Monday     Noon  2  1
2  Thursday  Morning  3  3
3  Thursday  Evening  1  2
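A small caveat with this approach, as far as I can tell: the round trip through to_records() turns the column labels into strings, because NumPy record field names have to be strings. The printed result looks the same as with reset_index(), but the integer labels 0 and 1 become '0' and '1'. A quick sketch to check this, using fixed values instead of random ones so the comparison is repeatable:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("Monday", "Morning"), ("Monday", "Noon"),
     ("Thursday", "Morning"), ("Thursday", "Evening")],
    names=["Weekday", "Time"],
)
# Fixed values (chosen for illustration) with integer column labels 0 and 1.
df = pd.DataFrame([[4, 0], [3, 3], [3, 1], [3, 2]], index=index)

via_records = pd.DataFrame(df.to_records())
via_reset = df.reset_index()

# Compare the column labels produced by each approach.
print(list(via_records.columns))
print(list(via_reset.columns))

# The cell values themselves should be the same either way.
same_values = via_records.to_numpy().tolist() == via_reset.to_numpy().tolist()
print(same_values)
```

If later code selects columns with df[0] or df[1], the reset_index() version keeps that working, while the to_records() version would need df['0'] and df['1'].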