Intermediate sums (subtotals) with pd.groupby()

Posted 2024-05-29 10:01:57


On Python 3.6 with pandas 1.1.2

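I'm trying to compute intermediate sums (subtotals). I could of course use sum(level), but that is neither elegant nor optimal, and I'd like to know whether there is a better way. For example:

import pandas as pd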

df = pd.DataFrame.from_dict({'level_0': {0: 'a', 1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'b', 6: 'b', 7: 'b', 8: 'c', 9: 'c', 10: 'c', 11: 'c', 12: 'c', 13: 'c', 14: 'c', 15: 'c'}, 'level_1': {0: 'aa', 1: 'aa', 2: 'bb', 3: 'aa', 4: 'aa', 5: 'aa', 6: 'cc', 7: 'cc', 8: 'bb', 9: 'bb', 10: 'cc', 11: 'cc', 12: 'cc', 13: 'dd', 14: 'dd', 15: 'dd'}, 'level_2': {0: 'aaa', 1: 'aab', 2: 'bba', 3: 'aaa', 4: 'aab', 5: 'aac', 6: 'cca', 7: 'ccb', 8: 'bba', 9: 'bbb', 10: 'cca', 11: 'ccb', 12: 'ccc', 13: 'dda', 14: 'ddb', 15: 'ddc'}, 'value': {0: 5, 1: 2, 2: 3, 3: 5, 4: 9, 5: 2, 6: 2, 7: 9, 8: 1, 9: 9, 10: 9, 11: 5, 12: 5, 13: 5, 14: 5, 15: 3}}).groupby(by=['level_0', 'level_1', 'level_2']).sum()

This gives me:

                         value
level_0 level_1 level_2       
a       aa      aaa          5
                aab          2
        bb      bba          3
b       aa      aaa          5
                aab          9
                aac          2
        cc      cca          2
                ccb          9
c       bb      bba          1
                bbb          9
        cc      cca          9
                ccb          5
                ccc          5
        dd      dda          5
                ddb          5
                ddc          3

Now I would like to be able to get subtotals for each level_0 group and each level_1 group, shown alongside the detail rows.
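For reference, the per-level sums mentioned in the question can be sketched as below. This assumes a recent pandas, where `sum(level=...)` is no longer available and `groupby(level=...).sum()` is the equivalent. The frame uses only the first five rows of the question's data to keep the example short, and the names `per_level_0` and `per_level_1` are illustrative.

```python
import pandas as pd

# First five rows of the question's data, summed and indexed by three levels.
df = pd.DataFrame({
    'level_0': ['a', 'a', 'a', 'b', 'b'],
    'level_1': ['aa', 'aa', 'bb', 'aa', 'aa'],
    'level_2': ['aaa', 'aab', 'bba', 'aaa', 'aab'],
    'value': [5, 2, 3, 5, 9],
}).groupby(by=['level_0', 'level_1', 'level_2']).sum()

# Subtotals computed from the index levels of the already-grouped frame.
per_level_0 = df.groupby(level='level_0').sum()
per_level_1 = df.groupby(level=['level_0', 'level_1']).sum()

print(per_level_0)
print(per_level_1)
```

Each call returns a separate frame with a shallower index, which is why interleaving the subtotals with the detail rows takes an extra step.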


1 answer
User
#1 · posted 2024-05-29 10:01:57

Here you go:

import pandas as pd

df = pd.DataFrame.from_dict({'level_0': {0: 'a', 1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'b', 6: 'b', 7: 'b', 8: 'c', 9: 'c', 10: 'c', 11: 'c', 12: 'c', 13: 'c', 14: 'c', 15: 'c'}, 'level_1': {0: 'aa', 1: 'aa', 2: 'bb', 3: 'aa', 4: 'aa', 5: 'aa', 6: 'cc', 7: 'cc', 8: 'bb', 9: 'bb', 10: 'cc', 11: 'cc', 12: 'cc', 13: 'dd', 14: 'dd', 15: 'dd'}, 'level_2': {0: 'aaa', 1: 'aab', 2: 'bba', 3: 'aaa', 4: 'aab', 5: 'aac', 6: 'cca', 7: 'ccb', 8: 'bba', 9: 'bbb', 10: 'cca', 11: 'ccb', 12: 'ccc', 13: 'dda', 14: 'ddb', 15: 'ddc'},
                             'value': {0: 5, 1: 2, 2: 3, 3: 5, 4: 9, 5: 2, 6: 2, 7: 9, 8: 1, 9: 9, 10: 9, 11: 5, 12: 5, 13: 5, 14: 5, 15: 3}})

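# Sum at each grouping depth: full detail, per (level_0, level_1), and per level_0.
gb1 = df.groupby(by=['level_0', 'level_1', 'level_2']).sum().reset_index()
gb2 = df.groupby(by=['level_0', 'level_1']).sum().reset_index()
gb3 = df.groupby(by=['level_0']).sum().reset_index()

# Blank out the levels each subtotal aggregates over; '' sorts before any label.
gb2['level_2'] = ''
gb3['level_1'] = ''
gb3['level_2'] = ''

# Stack the three frames and sort so each subtotal row precedes its detail rows.
gb_all = pd.concat((gb1, gb2, gb3), axis=0)
gb_all.sort_values(['level_0', 'level_1', 'level_2'], inplace=True)
gb_all.reset_index(inplace=True, drop=True)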

print(gb_all)

Output:

   level_0 level_1 level_2  value
0        a                     10
1        a      aa              7
2        a      aa     aaa      5
3        a      aa     aab      2
4        a      bb              3
5        a      bb     bba      3
6        b                     27
7        b      aa             16
8        b      aa     aaa      5
9        b      aa     aab      9
10       b      aa     aac      2
11       b      cc             11
12       b      cc     cca      2
13       b      cc     ccb      9
14       c                     42
15       c      bb             10
16       c      bb     bba      1
17       c      bb     bbb      9
18       c      cc             19
19       c      cc     cca      9
20       c      cc     ccb      5
21       c      cc     ccc      5
22       c      dd             13
23       c      dd     dda      5
24       c      dd     ddb      5
25       c      dd     ddc      3
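The same idea could be generalized to any number of grouping keys by looping over key prefixes instead of writing one groupby per depth. The following is only a sketch: the helper name `with_subtotals` and its parameters `value_col` and `filler` are hypothetical, and it assumes a single numeric column to sum and labels that sort after the filler string.

```python
import pandas as pd

def with_subtotals(df, keys, value_col='value', filler=''):
    """Return detail rows plus one subtotal row for every prefix of `keys`."""
    parts = []
    # Go from the full key list down to the first key only.
    for depth in range(len(keys), 0, -1):
        part = df.groupby(keys[:depth], as_index=False)[value_col].sum()
        # Fill the aggregated-over key columns so every part has the same columns.
        for missing in keys[depth:]:
            part[missing] = filler
        parts.append(part[keys + [value_col]])
    out = pd.concat(parts, ignore_index=True)
    # The filler sorts first, so each subtotal lands above its detail rows.
    return out.sort_values(keys, kind='stable').reset_index(drop=True)

# Small made-up sample in the same shape as the data above.
raw = pd.DataFrame({
    'level_0': ['a', 'a', 'a', 'b'],
    'level_1': ['aa', 'aa', 'bb', 'aa'],
    'level_2': ['aaa', 'aab', 'bba', 'aaa'],
    'value': [5, 2, 3, 5],
})
result = with_subtotals(raw, ['level_0', 'level_1', 'level_2'])
print(result)
```

Selecting `value_col` before summing keeps the string columns out of the aggregation, so the result does not depend on how a given pandas version treats non-numeric columns in `sum()`.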
