Pandas: cumulative sum conditioned on two columns

Posted 2024-06-08 17:18:48


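Suppose I have two farms, A and B. Each week there is a different animal on each farm. How can I get the cumulative number of the animal that is currently on each farm?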

+---+-----+--------+-----+--------+
|   |  A  | Farm_A |  B  | Farm_B |
+---+-----+--------+-----+--------+
| 0 | dog |   1    | cat |   1    |
| 1 | cat |   0    | dog |   1    |
| 2 | cat |   0    | dog |   1    |
| 3 | cat |   1    | dog |   0    |
| 4 | dog |   1    | dog |   1    |
| 5 | dog |   0    | dog |   0    |
| 6 | dog |   1    | cat |   1    |
+---+-----+--------+-----+--------+
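To make the example reproducible, the table above can be built as a DataFrame. This is a minimal sketch: the name df is taken from the code used later in the question, and the values are copied from the table.

```python
import pandas as pd

# Values copied row by row from the table above.
df = pd.DataFrame({
    'A':      ['dog', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog'],
    'Farm_A': [1, 0, 0, 1, 1, 0, 1],
    'B':      ['cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'cat'],
    'Farm_B': [1, 1, 1, 0, 1, 0, 1],
})
print(df)
```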

With groupby, I can get the cumsum for each farm separately:

df['A cumsum Farm_A'] = df.groupby(['A'])['Farm_A'].cumsum()
df['B cumsum Farm_B'] = df.groupby(['B'])['Farm_B'].cumsum()

+---+-----+--------+-----+--------+-----------------+-----------------+
|   |  A  | Farm_A |  B  | Farm_B | A cumsum Farm_A | B cumsum Farm_B |
+---+-----+--------+-----+--------+-----------------+-----------------+
| 0 | dog |   1    | cat |   1    |        1        |        1        |
| 1 | cat |   0    | dog |   1    |        0        |        1        |
| 2 | cat |   0    | dog |   1    |        0        |        2        |
| 3 | cat |   1    | dog |   0    |        1        |        2        |
| 4 | dog |   1    | dog |   1    |        2        |        3        |
| 5 | dog |   0    | dog |   0    |        2        |        3        |
| 6 | dog |   1    | cat |   1    |        3        |        2        |
+---+-----+--------+-----+--------+-----------------+-----------------+

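My question is: for each row, how can I get the cumulative total of that row's animal across both farm A and farm B?

For example, in row 3 the animal on farm A is a cat, so I want the total of cats on farms A and B over rows 0, 1, 2, 3 = 2 cats.

In row 3 the animal on farm B is a dog, so I want the total of dogs on both farms over rows 0, 1, 2, 3 = 3.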
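The two figures for row 3 can be checked by hand with boolean masks. A minimal sketch, assuming the data from the table at the top of the question; the helper name total_up_to is hypothetical and only used for this check.

```python
import pandas as pd

# Values copied from the table at the top of the question.
df = pd.DataFrame({
    'A':      ['dog', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog'],
    'Farm_A': [1, 0, 0, 1, 1, 0, 1],
    'B':      ['cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'cat'],
    'Farm_B': [1, 1, 1, 0, 1, 0, 1],
})

def total_up_to(animal, row):
    """Total of `animal` on both farms over rows 0..row (inclusive)."""
    part = df.loc[:row]  # label-based slice, so `row` itself is included
    on_a = part.loc[part['A'] == animal, 'Farm_A'].sum()
    on_b = part.loc[part['B'] == animal, 'Farm_B'].sum()
    return int(on_a + on_b)

print(total_up_to('cat', 3))  # 2
print(total_up_to('dog', 3))  # 3
```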

This is what I want to achieve:

+---+-----+--------+-----+--------+-----------------+-----------------+-----------------+-----------------+
|   |  A  | Farm_A |  B  | Farm_B | A cumsum Farm_A | B cumsum Farm_B | A at both farms | B at both farms |
+---+-----+--------+-----+--------+-----------------+-----------------+-----------------+-----------------+
| 0 | dog |   1    | cat |   1    |        1        |        1        |        1        |        1        |
| 1 | cat |   0    | dog |   1    |        0        |        1        |        1        |        2        |
| 2 | cat |   0    | dog |   1    |        0        |        2        |        1        |        3        |
| 3 | cat |   1    | dog |   0    |        1        |        2        |        2        |        3        |
| 4 | dog |   1    | dog |   1    |        2        |        3        |        4        |        5        |
| 5 | dog |   0    | dog |   0    |        2        |        3        |        5        |        5        |
| 6 | dog |   1    | cat |   1    |        3        |        2        |        6        |        3        |
+---+-----+--------+-----+--------+-----------------+-----------------+-----------------+-----------------+

1 answer
User
#1 · Posted 2024-06-08 17:18:48

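The last two columns can be created with dummy variables. Dummies let you build one cumsum per animal type across both farms, and you then look up the appropriate value for each row.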

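import numpy as np
import pandas as pd

res = pd.get_dummies(df, columns=['A', 'B'])
# An animal only counts when its dummy is set AND the farm value is non-zero,
# so multiply each dummy column by the matching Farm_* column.
res = pd.concat([res.filter(like='A_').multiply(res.Farm_A, axis=0),
                 res.filter(like='B_').multiply(res.Farm_B, axis=0)],
                axis=1)
# Cumsum per animal: add up the A_/B_ columns that share an animal name.
# Transposing avoids groupby(axis=1), which is deprecated since pandas 2.1.
res = res.T.groupby(res.columns.str.split('_').str[1]).sum().T.cumsum()
#   cat  dog
#0    1    1
#1    1    2
#2    1    3
#3    2    3
#4    2    5
#5    2    5
#6    3    6

# Lookup: DataFrame.lookup was removed in pandas 2.0, so index the array directly.
rows = np.arange(len(df))
df['A at both'] = res.to_numpy()[rows, res.columns.get_indexer(df.A)]
df['B at both'] = res.to_numpy()[rows, res.columns.get_indexer(df.B)]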

Output

     A  Farm_A    B  Farm_B  A at both  B at both
0  dog       1  cat       1          1          1
1  cat       0  dog       1          1          2
2  cat       0  dog       1          1          3
3  cat       1  dog       0          2          3
4  dog       1  dog       1          5          5
5  dog       0  dog       0          5          5
6  dog       1  cat       1          6          3
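For comparison, the same per-animal running totals can also be built without dummies, using one boolean mask per animal. This is a sketch under assumptions rather than the answerer's method: the sample data is copied from the question, the names animals and totals are illustrative, and the row lookup uses plain NumPy indexing.

```python
import numpy as np
import pandas as pd

# Values copied from the table in the question.
df = pd.DataFrame({
    'A':      ['dog', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog'],
    'Farm_A': [1, 0, 0, 1, 1, 0, 1],
    'B':      ['cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'cat'],
    'Farm_B': [1, 1, 1, 0, 1, 0, 1],
})

# Every animal that appears on either farm.
animals = pd.unique(df[['A', 'B']].to_numpy().ravel())

# One column per animal: that week's count on both farms, then a running total.
totals = pd.DataFrame({
    animal: ((df['A'] == animal) * df['Farm_A']
             + (df['B'] == animal) * df['Farm_B']).cumsum()
    for animal in animals
})

# For each row, pick the column that matches the animal on that farm.
rows = np.arange(len(df))
values = totals.to_numpy()
df['A at both'] = values[rows, totals.columns.get_indexer(df['A'])]
df['B at both'] = values[rows, totals.columns.get_indexer(df['B'])]
print(df)
```

With only a handful of animal types the loop over animals stays cheap; with many distinct values the dummy-based approach above is likely the more compact choice.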
