Filter a GroupBy result to the groups where at least one row meets a condition

test_df.groupby(['Category', 'Subcategory'])['Value'].sum()

# Output:
Category  Subcategory
P         A              2.0
          B              5.0
          C              8.0
Q         A              2.0
          B              1.0
          C              1.0
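The question does not show how test_df was built. The sketch below is a hypothetical reconstruction, assuming one row per (Category, Subcategory) pair with values chosen to reproduce the sums above; the real data may have had several rows per pair.

```python
import pandas as pd

# Hypothetical data: one row per (Category, Subcategory) pair, chosen so that
# the grouped sums match the output shown in the question.
test_df = pd.DataFrame({
    'Category': ['P', 'P', 'P', 'Q', 'Q', 'Q'],
    'Subcategory': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Value': [2.0, 5.0, 8.0, 2.0, 1.0, 1.0],
})

# Sum Value within each (Category, Subcategory) group.
s = test_df.groupby(['Category', 'Subcategory'])['Value'].sum()
print(s)
```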

2 Answers

User

Answer 1 · edited 2024-04-25 10:15:41

Use loc to keep every Category that has at least one group sum of 3 or more:

s = test_df.groupby(['Category', 'Subcategory'])['Value'].sum()
s.loc[s[s.ge(3)].index.get_level_values(0).unique()].reset_index()

  Category Subcategory  Value
0        P           A    2.0
1        P           B    5.0
2        P           C    8.0
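The one-liner above can be easier to follow when split into named steps. This is only an unpacked sketch of the same expression; the test_df data and the names hits, keep and result are assumptions introduced for illustration.

```python
import pandas as pd

# Hypothetical data consistent with the sums shown in the question.
test_df = pd.DataFrame({
    'Category': ['P', 'P', 'P', 'Q', 'Q', 'Q'],
    'Subcategory': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Value': [2.0, 5.0, 8.0, 2.0, 1.0, 1.0],
})
s = test_df.groupby(['Category', 'Subcategory'])['Value'].sum()

# Step 1: keep only the group sums that satisfy the condition.
hits = s[s.ge(3)]

# Step 2: collect the distinct first-level labels (Category) of those sums.
keep = hits.index.get_level_values(0).unique()

# Step 3: select every row of those Categories, then flatten the index.
result = s.loc[keep].reset_index()
print(result)
```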

User

Answer 2 · edited 2024-04-25 10:15:41

Use boolean indexing with isin:

mask = test_df['Category'].isin(test_df.loc[test_df['Value'] >= 3, 'Category'].unique())
a = test_df[mask]
print (a)
  Category Subcategory  Value
0        P           A    2.0
1        P           B    5.0
2        P           C    8.0

First, select the Category values of all rows that meet the condition:

print (test_df.loc[test_df['Value'] >= 3, 'Category'])
1    P
2    P
Name: Category, dtype: object

For better performance, reduce them to unique values (thanks, @Sandeep Kadapa):

print (test_df.loc[test_df['Value'] >= 3, 'Category'].unique())
['P']

Then filter the original column with isin:

print (test_df['Category'].isin(test_df.loc[test_df['Value'] >= 3, 'Category'].unique()))
0     True
1     True
2     True
3    False
4    False
5    False
Name: Category, dtype: bool
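For comparison, the same row selection can also be written with groupby itself. This is an alternative sketch, not taken from either answer, and the test_df data is hypothetical. It uses GroupBy.transform('any') to broadcast the per-Category test back to every row, and GroupBy.filter to drop whole groups.

```python
import pandas as pd

# Hypothetical data consistent with the sums shown in the question.
test_df = pd.DataFrame({
    'Category': ['P', 'P', 'P', 'Q', 'Q', 'Q'],
    'Subcategory': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Value': [2.0, 5.0, 8.0, 2.0, 1.0, 1.0],
})

# True for every row whose Category contains at least one Value >= 3.
mask = test_df['Value'].ge(3).groupby(test_df['Category']).transform('any')
kept = test_df[mask]

# Same rows via filter: the lambda is evaluated once per Category group.
filtered = test_df.groupby('Category').filter(lambda g: g['Value'].ge(3).any())

print(kept)
```

The filter form calls a Python function per group, so on data with many groups the isin or transform masks are likely to be faster.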

The same approach filters the MultiIndex Series produced by groupby:

s = test_df.groupby(['Category', 'Subcategory'])['Value'].sum()
print (s)
Category  Subcategory
P         A              2.0
          B              5.0
          C              8.0
Q         A              2.0
          B              1.0
          C              1.0
Name: Value, dtype: float64

idx0 = s.index.get_level_values(0)
a = s[idx0.isin(idx0[s >= 3].unique())]
print (a)
Category  Subcategory
P         A              2.0
          B              5.0
          C              8.0
Name: Value, dtype: float64
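If the threshold or the index level varies, the MultiIndex approach above could be wrapped in a small function. The helper keep_groups_with_any and the test_df data below are hypothetical, introduced only to illustrate the idea.

```python
import pandas as pd

def keep_groups_with_any(s, threshold, level=0):
    """Keep every row whose `level` label has at least one value >= threshold."""
    labels = s.index.get_level_values(level)
    # Labels that own at least one qualifying value.
    qualifying = labels[(s >= threshold).to_numpy()].unique()
    return s[labels.isin(qualifying)]

# Hypothetical data consistent with the sums shown in the question.
test_df = pd.DataFrame({
    'Category': ['P', 'P', 'P', 'Q', 'Q', 'Q'],
    'Subcategory': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Value': [2.0, 5.0, 8.0, 2.0, 1.0, 1.0],
})
s = test_df.groupby(['Category', 'Subcategory'])['Value'].sum()

by_category = keep_groups_with_any(s, 3)              # filter on Category
by_subcategory = keep_groups_with_any(s, 5, level=1)  # filter on Subcategory
print(by_category)
print(by_subcategory)
```

With level=1 the test applies per Subcategory instead, so rows from both P and Q survive when their Subcategory has a qualifying sum in any Category.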
