Grouping columns with a boolean comparison (similar to Matlab's grpstats)

Asked on 2025-04-18 11:58

I have a dataframe in Pandas that looks like this (with many other columns as well):

   chip  WL     ok
0     1   1   True
1     1   2   True
2     1   3   True
3     1   4   True
4     2   1  False
5     2   2   True
6     2   3   True
7     2   4   True

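I want to group by chip, count the number of WLs for each chip, and apply a logical AND across the values in the ok column within each group. The expected output looks like this: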

   chip  WLs     ok
0     1   4    True
1     2   4   False

In Matlab, this can be done with the following commands:

a = grpstats(CellYield,{'chip'},{@all},'DataVars',{'ok'});
a.Properties.VarNames{2} = 'WLs';
a.Properties.VarNames{3} = 'ok';

This produces a dataset like this:

chip WLs    ok
1    4      True
2    4      False

How can I do this in Python with Pandas?

Tags: logical operations, pandas, dataframe, grouping, boolean comparison, WL count

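1 Answer

groupby groups the data, and agg accepts a dict that maps each column to the function to apply to it. For the WL column we use count from pandas.Series, which counts the non-null values in each group. For the ok column we use all, which returns True if every value in the group is True and False otherwise.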

In [6]:

df.groupby('chip').agg({'WL':pd.Series.count, 'ok':all})

Out[6]:
      WL     ok
chip           
1      4   True
2      4  False

[2 rows x 2 columns]
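
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and uses named aggregation (available in pandas 0.25 and later) in place of the dict form above, so the count column comes out as WLs and chip is an ordinary column again, matching the layout the question asked for. The variable name summary is just an illustration, and the other columns mentioned in the question are left out.

```python
import pandas as pd

# Sample data copied from the question (the "many other columns" are omitted).
df = pd.DataFrame({
    "chip": [1, 1, 1, 1, 2, 2, 2, 2],
    "WL":   [1, 2, 3, 4, 1, 2, 3, 4],
    "ok":   [True, True, True, True, False, True, True, True],
})

# Named aggregation: output_column=(input_column, function).
# "count" counts non-null WL values; "all" is a logical AND over ok.
summary = (
    df.groupby("chip")
      .agg(WLs=("WL", "count"), ok=("ok", "all"))
      .reset_index()  # turn the chip index back into a regular column
)

print(summary)
```

Passing as_index=False to groupby should work as an alternative to the reset_index call.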

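Update

To assign these values back onto the original dataframe you can use transform, which returns a result aligned with the original rows. I couldn't find a way to make transform apply different functions to different columns, though; as far as I can tell, it does not accept a per-column dict of functions the way agg does.

So you can do it in two steps instead, like this: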

In [30]:

df['WL'] = df.groupby('chip')['WL'].transform('count')
df['ok'] = df.groupby('chip')['ok'].transform('all')
df
Out[30]:
       chip  WL     ok    foo    bar
index                               
0         1   4   True  hello  world
1         1   4   True  hello  world
2         1   4   True  hello  world
3         1   4   True  hello  world
4         2   4  False  hello  world
5         2   4  False  hello  world
6         2   4  False  hello  world
7         2   4  False  hello  world

[8 rows x 5 columns]
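
Here is a self-contained sketch of the same two-step idea, assuming the sample data from the question. Unlike the session above, it writes the results to new columns so the original WL and ok values are not overwritten. The column names n_WL and all_ok are made up for this example.

```python
import pandas as pd

# Sample data copied from the question (extra columns omitted).
df = pd.DataFrame({
    "chip": [1, 1, 1, 1, 2, 2, 2, 2],
    "WL":   [1, 2, 3, 4, 1, 2, 3, 4],
    "ok":   [True, True, True, True, False, True, True, True],
})

# transform broadcasts each group's result back to every row of that group,
# so the output has the same length and index as df.
df["n_WL"] = df.groupby("chip")["WL"].transform("count")    # hypothetical column name
df["all_ok"] = df.groupby("chip")["ok"].transform("all")    # hypothetical column name

print(df)
```

Keeping the per-row values next to the per-group values makes it easy to see which rows caused a chip to fail, for example with df[~df["ok"]].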

Answered on 2025-04-18 by Python大师
