Python - 数据反透视
我在找一个用python
的解决方案。需要帮助把一个数据框进行“反透视”,也就是把数据的结构调整一下。现在的数据结构有点奇怪,不太适合用基本的透视功能来重塑。
当前的数据框 - 这是我现在的样子
ABC Mechanical
Standard 15-Day 10-Day 5-Day
Terminal Units 0.49 0.75 0.69 0.63
Diffusers 0.35 0.55 0.45 0.4
Vent 0.8 0.95 0.9 0.85
Piping 0.7 0.85 0.8 0.75
Stoves 0.6 0.8 0.75 0.7
反透视后的数据框 - 这是我想要的样子
df.columns= Customer, Product Category, Ship Cycle, Multiplier
df.index= ABC Mechanical
Customer Product Category Ship Cycle Multiplier
ABC Mechanical Terminal Units Standard 0.49
ABC Mechanical Terminal Units 15-Day 0.75
ABC Mechanical Terminal Units 10-Day 0.69
ABC Mechanical Terminal Units 5-Day 0.63
ABC Mechanical Diffusers Standard 0.35
ABC Mechanical Diffusers 15-Day 0.55
ABC Mechanical Diffusers 10-Day 0.45
ABC Mechanical Diffusers 5-Day 0.4
任何帮助都非常感谢!
谢谢!
2 个回答
2
或者你可以使用 unstack()
:
In [41]:
print df
Standard 15-Day 10-Day 5-Day
Terminal_Units 0.49 0.75 0.69 0.63
Diffusers 0.35 0.55 0.45 0.40
Vent 0.80 0.95 0.90 0.85
Piping 0.70 0.85 0.80 0.75
Stoves 0.60 0.80 0.75 0.70
In [42]:
df2 = df.stack().reset_index()
df2.columns = ['Product Category', 'Ship Cycle', 'Multiplier']
df2['Customer'] = 'ABC Mechanical'
In [43]:
print df2.icol([3,0,1,2])
Customer Product Category Ship Cycle Multiplier
0 ABC Mechanical Terminal_Units Standard 0.49
1 ABC Mechanical Terminal_Units 15-Day 0.75
2 ABC Mechanical Terminal_Units 10-Day 0.69
3 ABC Mechanical Terminal_Units 5-Day 0.63
4 ABC Mechanical Diffusers Standard 0.35
5 ABC Mechanical Diffusers 15-Day 0.55
6 ABC Mechanical Diffusers 10-Day 0.45
7 ABC Mechanical Diffusers 5-Day 0.40
8 ABC Mechanical Vent Standard 0.80
9 ABC Mechanical Vent 15-Day 0.95
10 ABC Mechanical Vent 10-Day 0.90
11 ABC Mechanical Vent 5-Day 0.85
12 ABC Mechanical Piping Standard 0.70
13 ABC Mechanical Piping 15-Day 0.85
14 ABC Mechanical Piping 10-Day 0.80
15 ABC Mechanical Piping 5-Day 0.75
16 ABC Mechanical Stoves Standard 0.60
17 ABC Mechanical Stoves 15-Day 0.80
18 ABC Mechanical Stoves 10-Day 0.75
19 ABC Mechanical Stoves 5-Day 0.70
6
如果 df
看起来像这样:
In [26]: df
Out[26]:
Standard 15-Day 10-Day 5-Day
Terminal Units 0.49 0.75 0.69 0.63
Diffusers 0.35 0.55 0.45 0.40
Vent 0.80 0.95 0.90 0.85
Piping 0.70 0.85 0.80 0.75
Stoves 0.60 0.80 0.75 0.70
那么 pd.melt
可以让你接近想要的结果数据表:
In [27]: pd.melt(df.reset_index(), id_vars=['index']).sort_values(by=['index'])
Out[27]:
index variable value
1 Diffusers Standard 0.35
6 Diffusers 15-Day 0.55
11 Diffusers 10-Day 0.45
16 Diffusers 5-Day 0.40
3 Piping Standard 0.70
8 Piping 15-Day 0.85
13 Piping 10-Day 0.80
18 Piping 5-Day 0.75
4 Stoves Standard 0.60
9 Stoves 15-Day 0.80
14 Stoves 10-Day 0.75
19 Stoves 5-Day 0.70
0 Terminal Units Standard 0.49
5 Terminal Units 15-Day 0.75
10 Terminal Units 10-Day 0.69
15 Terminal Units 5-Day 0.63
2 Vent Standard 0.80
7 Vent 15-Day 0.95
12 Vent 10-Day 0.90
17 Vent 5-Day 0.85
我不太明白“ABC Mechanical”在原始数据表中在哪里,所以我没有尝试把它包含在结果里。你可以这样重命名列名:
In [28]: df = pd.melt(df.reset_index(), id_vars=['index']).sort_values(by=['index'])
In [29]: df.columns = ['Product Category', 'Ship Cycle', 'Multiplier']
In [31]: df.head()
Out[31]:
Product Category Ship Cycle Multiplier
1 Diffusers Standard 0.35
6 Diffusers 15-Day 0.55
11 Diffusers 10-Day 0.45
16 Diffusers 5-Day 0.40
3 Piping Standard 0.70