How do I group, count, or exclude by value?

Published 2024-04-18 07:55:21


I have a df whose groupby output looks like this:

   +----------------+----------------+-------------+
   | Team           | Method         |  Count      |
   +----------------+----------------+-------------+
   | Team 1         | Manual         |          14 |
   | Team 2         | Automated      |           5 |
   | Team 2         | Hybrid         |           1 |
   | Team 2         | Manual         |          25 |
   | Team 4         | Automated      |           1 |
   | Team 4         | Hybrid         |          13 |
   +----------------+----------------+-------------+

I want to build a count or groupby that shows only the teams that use the Manual method exclusively. How can I do that?

For this dataset the answer is Team 1, because it is the only team that uses nothing but the Manual method.


1 Answer
User
#1 · Posted 2024-04-18 07:55:21

You can use groupby with apply to check, with all, whether every value in a group is Manual, and then subset the frame with loc and isin:

print(df)
      Team     Method
0   Team 1     Manual
1   Team 1     Manual
2   Team 1     Manual
3   Team 1     Manual
4   Team 1     Manual
5   Team 2  Automated
6   Team 2  Automated
7   Team 2  Automated
8   Team 2  Automated
9   Team 2  Automated
10  Team 2     Hybrid
11  Team 2     Manual
12  Team 2     Manual
13  Team 3     Manual
14  Team 2     Manual
15  Team 2     Manual
16  Team 4  Automated
17  Team 4     Hybrid
# True for a team only if every Method value in that team is 'Manual'
g = df.groupby("Team")['Method'].apply(lambda x: (x == 'Manual').all())
print(g)
Team
Team 1     True
Team 2    False
Team 3     True
Team 4    False
Name: Method, dtype: bool

print(g[g.values].index)
Index(['Team 1', 'Team 3'], dtype='object', name='Team')

print(df.loc[df['Team'].isin(g[g.values].index)])
      Team  Method
0   Team 1  Manual
1   Team 1  Manual
2   Team 1  Manual
3   Team 1  Manual
4   Team 1  Manual
13  Team 3  Manual
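
The steps above can be put together as one runnable sketch. The frame is rebuilt by hand from the printed rows, and the variable names (teams, methods, only_manual, manual_teams, manual_rows) are my own and are not part of the original answer:

```python
import pandas as pd

# Rebuild the sample frame printed above (same 18 rows, same order).
teams = (["Team 1"] * 5 + ["Team 2"] * 8 + ["Team 3"]
         + ["Team 2"] * 2 + ["Team 4"] * 2)
methods = (["Manual"] * 5 + ["Automated"] * 5 + ["Hybrid"]
           + ["Manual"] * 5 + ["Automated", "Hybrid"])
df = pd.DataFrame({"Team": teams, "Method": methods})

# One bool per team: True only when every row of that team is 'Manual'.
only_manual = df.groupby("Team")["Method"].apply(lambda x: (x == "Manual").all())

# Names of the teams that passed the check.
manual_teams = only_manual[only_manual].index.tolist()

# Rows of the original frame that belong to those teams.
manual_rows = df[df["Team"].isin(manual_teams)]

print(manual_teams)
print(manual_rows)
```

If you only need the team names or a count, manual_teams (or its length) is enough and the last isin step can be skipped.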

To understand apply better, you can pass it a custom function f that prints, for each group, the result of comparing every item with the string Manual:

def f(x):
    # x is the Method Series of one team; print the element-wise comparison
    print(x == 'Manual')

df.groupby("Team")['Method'].apply(f)
0    True
1    True
2    True
3    True
4    True
Name: Team 1, dtype: bool
5     False
6     False
7     False
8     False
9     False
10    False
11     True
12     True
14     True
15     True
Name: Team 2, dtype: bool
13    True
Name: Team 3, dtype: bool
16    False
17    False
Name: Team 4, dtype: bool

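But what we actually need to know is whether all values in a group are the string Manual, that is, whether all of those comparisons are True, so we call all: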

def f(x):
    # One bool per team: True only if every value is 'Manual'
    print((x == 'Manual').all())

df.groupby("Team")['Method'].apply(f)
True
False
True
False
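
Note that f above only prints and returns nothing, so it is useful for inspecting the groups but not for building a result. A minimal sketch of a version that returns the value, next to an equivalent that avoids apply altogether, might look like this. The small frame and the names all_manual, via_apply and via_eq are hypothetical and chosen only for illustration:

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the frame from the answer).
df = pd.DataFrame({
    "Team": ["A", "A", "B", "B", "C"],
    "Method": ["Manual", "Manual", "Manual", "Hybrid", "Manual"],
})

def all_manual(x):
    # x is the 'Method' Series of one team; return a single bool for it.
    return (x == "Manual").all()

via_apply = df.groupby("Team")["Method"].apply(all_manual)

# Same check without a Python-level function: compare first, then group the
# boolean Series by the team column and reduce each group with all().
via_eq = df["Method"].eq("Manual").groupby(df["Team"]).all()

print(via_apply)
print(via_eq)
```

Both Series are True for teams A and C and False for team B. The second form tends to be the faster one on large frames, since the comparison is done once for the whole column.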

Edit: I added an example that groups by two columns:

print(df)
      Col1     Method  Col2
0   Team 1     Manual  Team
1   Team 1     Manual  Team
2   Team 1     Manual  Team
3   Team 1     Manual  Team
4   Team 1     Manual  Team
5   Team 2  Automated  Team
6   Team 2  Automated  Team
7   Team 2  Automated  Team
8   Team 2  Automated  Team
9   Team 2  Automated  Team
10  Team 2     Hybrid  Team
11  Team 2     Manual  Team
12  Team 2     Manual  Team
13  Team 3     Manual  Team
14  Team 2     Manual  Team
15  Team 2     Manual  Team
16  Team 4  Automated  Team
17  Team 4     Hybrid  Team
# Same check as before, now with one result per (Col1, Col2) pair
g = df.groupby(["Col1", "Col2"])['Method'].apply(lambda x: (x == 'Manual').all())
print(g)
Col1    Col2
Team 1  Team     True
Team 2  Team    False
Team 3  Team     True
Team 4  Team    False
Name: Method, dtype: bool

# Turn the two index levels back into ordinary columns
g = g.reset_index()
print(g)
     Col1  Col2 Method
0  Team 1  Team   True
1  Team 2  Team  False
2  Team 3  Team   True
3  Team 4  Team  False

g1 = g.loc[g['Method'], 'Col1']
print(g1)
0    Team 1
2    Team 3
Name: Col1, dtype: object

print(df.loc[df['Col1'].isin(g1.values)])
      Col1  Method  Col2
0   Team 1  Manual  Team
1   Team 1  Manual  Team
2   Team 1  Manual  Team
3   Team 1  Manual  Team
4   Team 1  Manual  Team
13  Team 3  Manual  Team
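
One caveat with the two-column example: the final isin step matches on Col1 only, which works here because Col2 is constant. If the second key varies within a team, a row-level mask built with transform may be safer. This is a sketch under that assumption, with hypothetical data and names (is_manual, mask, result):

```python
import pandas as pd

# Hypothetical frame where the second key actually varies within a team.
df = pd.DataFrame({
    "Col1": ["Team 1", "Team 1", "Team 1", "Team 2", "Team 2"],
    "Col2": ["East", "East", "West", "East", "East"],
    "Method": ["Manual", "Manual", "Hybrid", "Manual", "Manual"],
})

is_manual = df["Method"].eq("Manual")

# transform("all") broadcasts each group's result back onto its own rows,
# so the mask lines up with df and respects both grouping keys.
mask = is_manual.groupby([df["Col1"], df["Col2"]]).transform("all")

result = df[mask]
print(result)
```

Here the (Team 1, West) row is dropped while the (Team 1, East) rows are kept, whereas filtering with isin on Col1 alone would keep all three Team 1 rows.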
