Counting possible groups in Pandas

Posted 2024-04-20 10:38:36

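I can't figure out how to efficiently count the groups that can occur in a pandas column. I want to find the combinations of products that recur most often across customer purchases. For example: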

[Table of clients and their purchased products; not preserved in this copy]
Groups = {A,B}, {C,D}, {A,B,C}, {B,C}, {B,C,D}
Count of Group {A,B} = 3 (Client 1-3-5)
Count of Group {C,D} = 3 (Client 2-4-5)
Count of Group {A,B,C} = 2 (Client 3-5)
Count of Group {B,C} = 3 (Client 2-3-5)
Count of Group {B,C,D} = 2 (Client 2,5)

1 Answer
User
#1 · Posted 2024-04-20 10:38:36

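Let's try get_dummies to split the Product column into one indicator column per product, then loop over the groups and count the rows that contain every product in each group: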

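import pandas as pd

Groups = [{'A','B'}, {'C','D'}, {'A','B','C'}, {'B','C'}, {'B','C','D'}]
# One 0/1 indicator column per product, splitting on commas
s = df.Product.str.get_dummies(',')
# A row contains a group when all of that group's columns are 1
out = pd.Series([s[list(group)].all(axis=1).sum() for group in Groups],
                index=list(map(tuple, Groups)))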

Output (the order of items inside each tuple may vary, since the groups are sets):

(A, B)       3
(C, D)       3
(C, A, B)    2
(C, B)       3
(C, D, B)    2
dtype: int64
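
The answer above counts a hand-written list of groups. If the groups themselves are not known in advance, one possible approach is to enumerate every product combination within each row and tally them. The following is only a sketch under assumptions: the sample data is hypothetical (chosen so that it reproduces the counts quoted in the question, since the original table is not available), and the helper name count_groups and its min_size parameter are illustrative, not part of pandas.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical data: picked only to match the counts quoted in the question.
df = pd.DataFrame({
    "Client": [1, 2, 3, 4, 5],
    "Product": ["A,B", "B,C,D", "A,B,C", "C,D", "A,B,C,D"],
})


def count_groups(products, min_size=2):
    """Count how many rows contain each product combination.

    Every combination of at least `min_size` distinct products found in a
    row is counted once for that row. Keys are sorted tuples of products.
    """
    counts = Counter()
    for row in products:
        # De-duplicate and sort so the same group always yields the same key.
        items = sorted(set(row.split(",")))
        for size in range(min_size, len(items) + 1):
            counts.update(combinations(items, size))
    return counts


group_counts = count_groups(df["Product"])

# Print the groups from most to least frequent.
for group, n in group_counts.most_common():
    print(group, n)
```

Note that the number of combinations grows exponentially with the number of products in a row, so this is only practical when each client buys a small number of distinct products.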