Groupby with a MultiIndex and multi-level columns

Posted 2024-04-23 17:42:51

I have a DataFrame with a 3-level index and 2-level columns:

                                       Group
                           Label       A        B       C        D
number      start          end              
1           2020-01-01  2020-12-31  -43.0    0      105.0   -37.0
            2020-12-15  2020-12-15  NaN     NaN      NaN    195.0
2           2019-01-01  2019-12-31  -35.0   80.0    -14.0   NaN
            2019-12-17  2019-12-17  NaN     NaN      NaN    141.0
            2020-01-01  2020-12-31  -15.0   45.0    -7.0    NaN
3           2020-12-17  2020-12-17  NaN     NaN      NaN    326.0
            2022-01-01  2022-12-31  NaN     50.0     NaN    NaN
            2023-12-31  2023-12-31  -25.0   NaN      NaN    NaN
            2023-01-01  2023-12-31  NaN    50.0      NaN    NaN            
            2020-12-15  2020-12-15  NaN     NaN      NaN    61.0
.............

I want to group by number and start (year only) and sum the values for each label:

                                      Group
                           Label       A        B       C        D
number      start          end              
1           2020        2020        -43.0    0      105.0   232.0
2           2019        2019        -35.0   80.0    -14.0   141
            2020        2020        -15.0   45.0    -7.0    NaN
3           2020        2020        NaN     NaN      NaN    387.0
            2022        2022        NaN     50.0     NaN    NaN
            2023        2023        -25.0   50.0     NaN    NaN    
.............

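Note that there are more top-level columns (the one shown is called Group; I left the others out to keep things simple) and other sub-columns (Label: A, B, C, D, repeated under each top-level column). How can I do this? Thanks in advance.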


1 answer
User
#1 · Posted 2024-04-23 17:42:51

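You can refer to MultiIndex levels by name and use DatetimeIndex.year to keep only the year of the levels you care about (this assumes the start and end levels hold datetimes). Passing min_count=1 to sum() gives NaN instead of 0 for group cells where every value is missing.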

df.groupby(['number', 
            df.index.get_level_values('start').year,
            df.index.get_level_values('end').year]).sum(min_count=1)

                      A     B      C      D
number start end                           
1      2020  2020 -43.0   0.0  105.0  158.0
2      2019  2019 -35.0  80.0  -14.0  141.0
       2020  2020 -15.0  45.0   -7.0    NaN
3      2020  2020   NaN   NaN    NaN  387.0
       2022  2022   NaN  50.0    NaN    NaN
       2023  2023 -25.0  50.0    NaN    NaN
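
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds only the ten rows shown in the question, and it assumes the start and end levels are datetimes, that there is a single top-level column named Group, and that the second column level is named Label. The names flat, start_year, end_year and result are made up for this illustration.

```python
import numpy as np
import pandas as pd

# The ten sample rows from the question (the real frame has more rows
# and more top-level column groups, which are not reproduced here).
rows = [
    (1, "2020-01-01", "2020-12-31", -43.0, 0.0, 105.0, -37.0),
    (1, "2020-12-15", "2020-12-15", np.nan, np.nan, np.nan, 195.0),
    (2, "2019-01-01", "2019-12-31", -35.0, 80.0, -14.0, np.nan),
    (2, "2019-12-17", "2019-12-17", np.nan, np.nan, np.nan, 141.0),
    (2, "2020-01-01", "2020-12-31", -15.0, 45.0, -7.0, np.nan),
    (3, "2020-12-17", "2020-12-17", np.nan, np.nan, np.nan, 326.0),
    (3, "2022-01-01", "2022-12-31", np.nan, 50.0, np.nan, np.nan),
    (3, "2023-12-31", "2023-12-31", -25.0, np.nan, np.nan, np.nan),
    (3, "2023-01-01", "2023-12-31", np.nan, 50.0, np.nan, np.nan),
    (3, "2020-12-15", "2020-12-15", np.nan, np.nan, np.nan, 61.0),
]
flat = pd.DataFrame(rows, columns=["number", "start", "end", "A", "B", "C", "D"])

# Assumption: the date levels are real datetimes; .year is not available otherwise.
flat["start"] = pd.to_datetime(flat["start"])
flat["end"] = pd.to_datetime(flat["end"])

df = flat.set_index(["number", "start", "end"])
# Assumption: one top-level column "Group" over the labels A..D.
df.columns = pd.MultiIndex.from_product(
    [["Group"], ["A", "B", "C", "D"]], names=[None, "Label"]
)

# Year of each date level; the level names "start" and "end" carry over.
start_year = df.index.get_level_values("start").year
end_year = df.index.get_level_values("end").year

# min_count=1 keeps NaN where a group has no valid values at all.
result = df.groupby(["number", start_year, end_year]).sum(min_count=1)

# For comparison: the default min_count=0 turns all-NaN groups into 0.0.
zeros_instead = df.groupby(["number", start_year, end_year]).sum()

print(result)
```

On these ten rows, D for number 1 in 2020 is 158.0 (-37.0 + 195.0), matching the output above. Because the grouping only touches the row index, any extra top-level column groups should pass through the same call unchanged.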
