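This is my dataset. The following items are recorded every day:
Cigarettes, Tobacco, Snack/Grocery, Beverages, Milk, Coffee, Solaray, Prepared Foods, International Food, Automotive/News Paper, Lottery - Scratch, Lottery - Machine, and Whl-Sales/Gift-Card repeat on each date.
I want to convert this frame into one that covers the same data, with the repeated departments as columns, the date as the index, and Sales as the values. I tried using pivot_table, but I realized that it changes the values and combinations. This is what I came up with, but the result was not what I expected: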
dept = dept.pivot_table(values='Sales', index = dept.index, columns='Dept', aggfunc='first')
This is the original DataFrame I want to reshape:
Date Dept Sales
2018-12-01 Cigarettes 426.889
2018-12-01 Tobacco 43.84
2018-12-01 Snack/Grocery 198.57
2018-12-01 Beverages 160.97
2018-12-01 Milk 11.56
2018-12-01 Coffee 29.72
2018-12-01 Solaray 9.99
2018-12-01 Prepared Foods 3.99
2018-12-01 International Food 65
2018-12-01 Sweets 0
2018-12-01 Automotive/News Paper 10.47
2018-12-01 Lottery - Scratch 1397
2018-12-01 Lottery - Machine 191
2018-12-01 Whl-Sales/Gift-Card 0
2018-12-01 Total 2549
2018-12-02 Cigarettes 374.01
2018-12-02 Tobacco 89.29
2018-12-02 Snack/Grocery 178.01
2018-12-02 Beverages 135.28
2018-12-02 Milk 9.57
2018-12-02 Coffee 33.76
2018-12-02 Solaray 17.99
2018-12-02 Prepared Foods 20.98
2018-12-02 International Food 3.98
2018-12-02 Sweets 0
2018-12-02 Automotive/News Paper 13.16
2018-12-02 Lottery - Scratch 651
2018-12-02 Lottery - Machine 211
2018-12-02 Whl-Sales/Gift-Card 0
2018-12-02 Total 1738.03
2018-12-03 Cigarettes 463.54
2018-12-03 Tobacco 35.26
2018-12-03 Snack/Grocery 164.19
2018-12-03 Beverages 126.01
2018-12-03 Milk 8.57
2018-12-03 Coffee 30.47
2018-12-03 Solaray 17.99
2018-12-03 Prepared Foods 0
2018-12-03 International Food 21.98
2018-12-03 Sweets 0
2018-12-03 Automotive/News Paper 70.17
2018-12-03 Lottery - Scratch 1046
2018-12-03 Lottery - Machine 461
2018-12-03 Whl-Sales/Gift-Card 0
2018-12-03 Total 2445.18
2018-12-03 Cigarettes 463.54
2018-12-03 Tobacco 35.26
2018-12-03 Snack/Grocery 164.19
2018-12-03 Beverages 126.01
2018-12-03 Milk 8.57
2018-12-03 Coffee 30.47
2018-12-03 Solaray 17.99
2018-12-03 Prepared Foods 0
2018-12-03 International Food 21.98
2018-12-03 Sweets 0
2018-12-03 Automotive/News Paper 70.17
2018-12-03 Lottery - Scratch 1046
2018-12-03 Lottery - Machine 461
2018-12-03 Whl-Sales/Gift-Card 0
2018-12-03 Total 2445.18
2018-12-04 Cigarettes 291.91
2018-12-04 Tobacco 42.93
2018-12-04 Snack/Grocery 207.87
2018-12-04 Beverages 163.11
2018-12-04 Milk 3.99
2018-12-04 Coffee 32.17
2018-12-04 Solaray 40.98
2018-12-04 Prepared Foods 5
2018-12-04 International Food 6.98
2018-12-04 Sweets 0
2018-12-04 Automotive/News Paper 47
2018-12-04 Lottery - Scratch 762
2018-12-04 Lottery - Machine 112.75
2018-12-04 Whl-Sales/Gift-Card NaN
2018-12-04 Total 1716.69
2018-12-05 Cigarettes 255.72
2018-12-05 Tobacco 81.52
2018-12-05 Snack/Grocery 212.94
2018-12-05 Beverages 87.94
2018-12-05 Milk 9.77
2018-12-05 Coffee 15.95
2018-12-05 Solaray 11.98
2018-12-05 Prepared Foods 8.98
2018-12-05 International Food 17.73
2018-12-05 Sweets 0
2018-12-05 Automotive/News Paper 46.24
2018-12-05 Lottery - Scratch 540
2018-12-05 Lottery - Machine 151
2018-12-05 Whl-Sales/Gift-Card NaN
2018-12-05 Total 1439.77
2018-12-06 Cigarettes 377.96
2018-12-06 Tobacco 129.07
2018-12-06 Snack/Grocery 281.83
2018-12-06 Beverages 235.73
2018-12-06 Milk 0
2018-12-06 Coffee 29.32
2018-12-06 Solaray 12.99
2018-12-06 Prepared Foods 27.37
2018-12-06 International Food 9.99
2018-12-06 Sweets 5
2018-12-06 Automotive/News Paper 32.92
2018-12-06 Lottery - Scratch 509
2018-12-06 Lottery - Machine 194
2018-12-06 Whl-Sales/Gift-Card NaN
2018-12-06 Total 1845.18
2018-12-07 Cigarettes 526.91
2018-12-07 Tobacco 65.71
2018-12-07 Snack/Grocery 202.27
2018-12-07 Beverages 183.59
2018-12-07 Milk 2.79
2018-12-07 Coffee 16.22
2018-12-07 Solaray 5.99
2018-12-07 Prepared Foods 24.98
2018-12-07 International Food 1.99
2018-12-07 Sweets 0
2018-12-07 Automotive/News Paper 31.06
2018-12-07 Lottery - Scratch 300
2018-12-07 Lottery - Machine 61.5
2018-12-07 Whl-Sales/Gift-Card 0
2018-12-07 Total 1423.01
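One way is to set the index to ['Date', 'Dept'] and call unstack(), but the date 2018-12-03 has multiple values for each Dept. I'm not sure whether that is expected, but one way to deal with it is to use groupby().first() to take the first value for each (Date, Dept) pair and then unstack().
That is almost the same as df.pivot_table(index='Date', columns='Dept', values='Sales'), which by default averages duplicate entries instead of taking the first one.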
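The approach described in the answer can be sketched as follows. This is a minimal illustration and not the answerer's original code: the small frame below is a hypothetical subset of the question's data (two departments, with the 2018-12-03 rows repeated to mimic the duplicated block), and the names df, dupes, wide and wide_pt are assumptions made for the example.

```python
import pandas as pd

# Hypothetical subset of the question's long-format data; the
# 2018-12-03 rows appear twice, as they do in the original frame.
df = pd.DataFrame({
    "Date": ["2018-12-01", "2018-12-01",
             "2018-12-03", "2018-12-03",
             "2018-12-03", "2018-12-03"],
    "Dept": ["Cigarettes", "Tobacco",
             "Cigarettes", "Tobacco",
             "Cigarettes", "Tobacco"],
    "Sales": [426.889, 43.84, 463.54, 35.26, 463.54, 35.26],
})

# Show every (Date, Dept) pair that occurs more than once.
dupes = df[df.duplicated(["Date", "Dept"], keep=False)]
print(dupes)

# A plain set_index + unstack cannot reshape duplicated index entries.
try:
    df.set_index(["Date", "Dept"])["Sales"].unstack()
except ValueError as err:
    print("unstack failed:", err)

# Keep the first value per (Date, Dept), then move Dept into the columns.
wide = df.groupby(["Date", "Dept"])["Sales"].first().unstack()
print(wide)

# pivot_table also collapses duplicates, using the mean by default.
wide_pt = df.pivot_table(index="Date", columns="Dept", values="Sales")
print(wide_pt)
```

Because the repeated rows here are identical, taking the first value and taking the mean give the same table. If the duplicates carried different values, the two results would differ, so it is worth inspecting dupes before choosing an aggregation.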