Unpivot/unstack an Excel pivot table using Python

Posted 2024-04-25 07:10:44

I have an Excel pivot table with data like this:

Code      Region   Detail      Oct'17    Sep'17    Aug'17
AXISCGF   zone 1   IND3D01024  -82,000   0         900,000 
AXISDEF   zone 5   INP467B029   85,000   182,000   0
AXISEAF   zone 4   INZ514ELY4  -13,500   0         5,00,000 
AXISEQF   zone 2   INQ916D14E   0       -500,000   25,400 

The result I want is:

 Code      Region   Detail      Month    Change
 AXISCGF   zone 1   IND3D01024  Oct'17  -82,000   
 AXISDEF   zone 5   INP467B029  Oct'17   85,000
 ..
 AXISCGF   zone 1   IND3D01024  Sep'17   0
 AXISDEF   zone 5   INP467B029  Sep'17   182,000    

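I want the first three columns to stay as they are, and every column from the fourth onward to be appended underneath as rows. I can loop from the fourth column to the last one. My code: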

import pandas
df = pandas.read_excel('Data_2017.xlsx', sheet_name='Sales')

for data_column in df.columns[3:10]:
    df_new = df.filter(['Code','Region','Detail',data_column])
    #print df_new.head()
    df_new.to_csv('_TEST_Output.csv',mode='a')

Output:

     Code      Region   Detail      Oct'17    
 0   AXISCGF   zone 1   IND3D01024  -82,000   
 1   AXISDEF   zone 5   INP467B029   85,000   
..
3501 AXISESF   zone 3   INO0201062   0
     Code      Region   Detail      Sep'17    
 0   AXISDEF   zone 5   INP467B029  182,000

How can I get the desired output with pandas?


1 Answer
User
#1 · Posted 2024-04-25 07:10:44

Use melt, keeping the first three columns as identifier columns:

df = df.melt(['Code','Region','Detail'], value_name='Change', var_name='Month')
print (df)
       Code  Region      Detail   Month    Change
0   AXISCGF  zone 1  IND3D01024  Oct'17   -82,000
1   AXISDEF  zone 5  INP467B029  Oct'17    85,000
2   AXISEAF  zone 4  INZ514ELY4  Oct'17   -13,500
3   AXISEQF  zone 2  INQ916D14E  Oct'17         0
4   AXISCGF  zone 1  IND3D01024  Sep'17         0
5   AXISDEF  zone 5  INP467B029  Sep'17   182,000
6   AXISEAF  zone 4  INZ514ELY4  Sep'17         0
7   AXISEQF  zone 2  INQ916D14E  Sep'17  -500,000
8   AXISCGF  zone 1  IND3D01024  Aug'17   900,000
9   AXISDEF  zone 5  INP467B029  Aug'17         0
10  AXISEAF  zone 4  INZ514ELY4  Aug'17  5,00,000
11  AXISEQF  zone 2  INQ916D14E  Aug'17    25,400
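
For anyone who wants to try this without the Excel file, here is a minimal self-contained sketch of the same melt call. The frame is typed in by hand from the sample in the question, and storing the values as plain integers (with 5,00,000 read as 500000) is my assumption; the real sheet may hold formatted strings instead. The names wide, long_df and id_cols are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question. Integer values are an assumption:
# the original sheet may contain formatted strings such as "-82,000".
wide = pd.DataFrame({
    "Code": ["AXISCGF", "AXISDEF", "AXISEAF", "AXISEQF"],
    "Region": ["zone 1", "zone 5", "zone 4", "zone 2"],
    "Detail": ["IND3D01024", "INP467B029", "INZ514ELY4", "INQ916D14E"],
    "Oct'17": [-82000, 85000, -13500, 0],
    "Sep'17": [0, 182000, 0, -500000],
    "Aug'17": [900000, 0, 500000, 25400],
})

# Columns that stay as they are; every other column is unpivoted.
id_cols = ["Code", "Region", "Detail"]

# One output row per (original row, month column) pair.
long_df = wide.melt(id_vars=id_cols, var_name="Month", value_name="Change")

print(long_df)
```

Because the result is a single frame, it can be written out once with long_df.to_csv('_TEST_Output.csv', index=False) instead of appending to the file inside a loop, which is what repeated the header in the question's output.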

Another solution is stack, but the rows come out in a different order (grouped by Code rather than by Month):

df = (df.set_index(['Code','Region','Detail'])
       .stack()
       .reset_index(name='Change')
       .rename(columns={'level_3':'Month'}))

print (df)
       Code  Region      Detail   Month    Change
0   AXISCGF  zone 1  IND3D01024  Oct'17   -82,000
1   AXISCGF  zone 1  IND3D01024  Sep'17         0
2   AXISCGF  zone 1  IND3D01024  Aug'17   900,000
3   AXISDEF  zone 5  INP467B029  Oct'17    85,000
4   AXISDEF  zone 5  INP467B029  Sep'17   182,000
5   AXISDEF  zone 5  INP467B029  Aug'17         0
6   AXISEAF  zone 4  INZ514ELY4  Oct'17   -13,500
7   AXISEAF  zone 4  INZ514ELY4  Sep'17         0
8   AXISEAF  zone 4  INZ514ELY4  Aug'17  5,00,000
9   AXISEQF  zone 2  INQ916D14E  Oct'17         0
10  AXISEQF  zone 2  INQ916D14E  Sep'17  -500,000
11  AXISEQF  zone 2  INQ916D14E  Aug'17    25,400
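
To make the ordering difference concrete, the sketch below runs both approaches on the same hand-typed sample and then reorders the stacked result month by month so it lines up with the melt result. It is only an illustration under the same assumption as before (integer values typed in from the question); month_order, the temporary _rank column and the other variable names are hypothetical helpers, not part of the original answer. Naming the column axis with rename_axis is an alternative to renaming level_3 afterwards.

```python
import pandas as pd

# Sample rows copied from the question (integer values are an assumption).
wide = pd.DataFrame({
    "Code": ["AXISCGF", "AXISDEF", "AXISEAF", "AXISEQF"],
    "Region": ["zone 1", "zone 5", "zone 4", "zone 2"],
    "Detail": ["IND3D01024", "INP467B029", "INZ514ELY4", "INQ916D14E"],
    "Oct'17": [-82000, 85000, -13500, 0],
    "Sep'17": [0, 182000, 0, -500000],
    "Aug'17": [900000, 0, 500000, 25400],
})
id_cols = ["Code", "Region", "Detail"]
month_order = [c for c in wide.columns if c not in id_cols]

# melt: all rows for the first month, then all rows for the next, and so on.
melted = wide.melt(id_vars=id_cols, var_name="Month", value_name="Change")

# stack: all months for the first row, then all months for the next row.
stacked = (
    wide.set_index(id_cols)
        .rename_axis(columns="Month")  # names the new index level "Month"
        .stack()
        .reset_index(name="Change")
)

# Reorder the stacked rows by the months' original column position.
# A stable sort keeps the original row order within each month.
rank = stacked["Month"].map({m: i for i, m in enumerate(month_order)})
stacked_as_melt = (
    stacked.assign(_rank=rank)
           .sort_values("_rank", kind="stable")
           .drop(columns="_rank")
           .reset_index(drop=True)
)

print(stacked.head(3))
print(stacked_as_melt.head(3))
```

If the order of the rows does not matter for the downstream step, either result can be used directly; the reordering is only needed when the two outputs have to match row for row.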
