How can I get specific unique combinations using only DataFrame operations?

3 votes
1 answer
65 views
Asked 2025-04-14 18:37

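I have some data recording two players taking turns at a game. After each of them wins or loses, the pair gets a shared score (the logic doesn't matter here and the numbers are random; this is only to illustrate what I want).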

So for every possible pair of outcomes for players P1 and P2, there is a corresponding score.

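The specific rules of the game don't matter. I just want to know whether I can take my initial DataFrame and build a new one that contains all unique combinations for the 4 players. That is, compute the score for every possible outcome combination when the 4 players play together, assuming the pair scores are additive.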

For example:

Player_1 Player_2 Player_3 Player_4 Outcome_1 Outcome_2 Outcome_3 Outcome_4  Score
      P1       P2       P3       P4       win       win       win       win     72

There are other possible unique combinations as well.

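The point is to take the 30 points from the combination where P1 and P2 both win and the 42 points from the combination where P3 and P4 both win, then add the two to get the total score (72) for the case where all 4 players win.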

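I can do this by generating the unique combinations explicitly, but in a real application with more parameters the code gets long and hard to read. I'd like to know whether there is a way to do it using only operations such as merge, groupby, join, and aggregation.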

import pandas as pd

data = {
    "Player_1": ["P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2", "P1", "P1", "P1", "P1", "P3", "P3", "P3", "P3"],
    "Player_2": ["P2", "P2", "P2", "P2", "P3", "P3", "P3", "P3", "P4", "P4", "P4", "P4", "P4", "P4", "P4", "P4"],
    "Outcome_1": ["win", "win", "lose", "lose", "win", "win", "lose", "lose", "win", "win", "lose", "lose", "win", "win", "lose", "lose"],
    "Outcome_2": ["win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose"],
    "Score": [30, 45, 12, 78, 56, 21, 67, 90, 15, 32, 68, 88, 42, 74, 8, 93]
}

df = pd.DataFrame(data)

print(df)

   Player_1 Player_2 Outcome_1 Outcome_2  Score
0        P1       P2       win       win     30
1        P1       P2       win      lose     45
2        P1       P2      lose       win     12
3        P1       P2      lose      lose     78
4        P2       P3       win       win     56
5        P2       P3       win      lose     21
6        P2       P3      lose       win     67
7        P2       P3      lose      lose     90
8        P1       P4       win       win     15
9        P1       P4       win      lose     32
10       P1       P4      lose       win     68
11       P1       P4      lose      lose     88
12       P3       P4       win       win     42
13       P3       P4       win      lose     74
14       P3       P4      lose       win      8
15       P3       P4      lose      lose     93

1 Answer

3

I hope I've understood your question. Going by the comments, I'm assuming your table contains 4 players in total:

from itertools import product

import numpy as np  # needed for np.unique below; pd and df come from the question

p1, p2, p3, p4 = np.unique(df[["Player_1", "Player_2"]].values)
df = df.set_index(["Player_1", "Player_2", "Outcome_1", "Outcome_2"])

all_data = []
for p1o1, p2o2, p3o1, p4o2 in product(["win", "lose"], repeat=4):
    all_data.append(
        (
            p1,
            p2,
            p3,
            p4,
            p1o1,
            p2o2,
            p3o1,
            p4o2,
            df.loc[(p1, p2, p1o1, p2o2), "Score"]
            + df.loc[(p3, p4, p3o1, p4o2), "Score"],
        )
    )

out = pd.DataFrame(
    all_data,
    columns=[
        "Player_1",
        "Player_2",
        "Player_3",
        "Player_4",
        "Outcome_1",
        "Outcome_2",
        "Outcome_3",
        "Outcome_4",
        "Score",
    ],
)

Output of print(out):

   Player_1 Player_2 Player_3 Player_4 Outcome_1 Outcome_2 Outcome_3 Outcome_4  Score
0        P1       P2       P3       P4       win       win       win       win     72
1        P1       P2       P3       P4       win       win       win      lose    104
2        P1       P2       P3       P4       win       win      lose       win     38
3        P1       P2       P3       P4       win       win      lose      lose    123
4        P1       P2       P3       P4       win      lose       win       win     87
5        P1       P2       P3       P4       win      lose       win      lose    119
6        P1       P2       P3       P4       win      lose      lose       win     53
7        P1       P2       P3       P4       win      lose      lose      lose    138
8        P1       P2       P3       P4      lose       win       win       win     54
9        P1       P2       P3       P4      lose       win       win      lose     86
10       P1       P2       P3       P4      lose       win      lose       win     20
11       P1       P2       P3       P4      lose       win      lose      lose    105
12       P1       P2       P3       P4      lose      lose       win       win    120
13       P1       P2       P3       P4      lose      lose       win      lose    152
14       P1       P2       P3       P4      lose      lose      lose       win     86
15       P1       P2       P3       P4      lose      lose      lose      lose    171
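
Since the question asks for a merge-style approach, here is a minimal sketch of the same result using a cross join instead of an explicit loop. It assumes pandas 1.2 or later (for `how="cross"`) and, like the loop above, hard-codes the pairing (P1, P2) with (P3, P4); the names `left`, `right`, and `cols` are only illustrative and do not come from the original post.

```python
import pandas as pd

# Same data as in the question.
data = {
    "Player_1": ["P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2", "P1", "P1", "P1", "P1", "P3", "P3", "P3", "P3"],
    "Player_2": ["P2", "P2", "P2", "P2", "P3", "P3", "P3", "P3", "P4", "P4", "P4", "P4", "P4", "P4", "P4", "P4"],
    "Outcome_1": ["win", "win", "lose", "lose", "win", "win", "lose", "lose", "win", "win", "lose", "lose", "win", "win", "lose", "lose"],
    "Outcome_2": ["win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose"],
    "Score": [30, 45, 12, 78, 56, 21, 67, 90, 15, 32, 68, 88, 42, 74, 8, 93],
}
df = pd.DataFrame(data)

# Rows for the first pair (assumed to be P1 vs P2).
left = df[(df["Player_1"] == "P1") & (df["Player_2"] == "P2")]

# Rows for the second pair (assumed to be P3 vs P4), renamed so that
# the columns describe players 3 and 4 after the join.
right = df[(df["Player_1"] == "P3") & (df["Player_2"] == "P4")].rename(
    columns={
        "Player_1": "Player_3",
        "Player_2": "Player_4",
        "Outcome_1": "Outcome_3",
        "Outcome_2": "Outcome_4",
    }
)

# Cross join: every P1/P2 outcome row paired with every P3/P4 outcome row.
# The overlapping "Score" column gets the suffixes "_a" and "_b".
out = left.merge(right, how="cross", suffixes=("_a", "_b"))

# The combined score is the sum of the two pair scores.
out["Score"] = out["Score_a"] + out["Score_b"]

cols = [
    "Player_1", "Player_2", "Player_3", "Player_4",
    "Outcome_1", "Outcome_2", "Outcome_3", "Outcome_4",
    "Score",
]
out = out[cols]
print(out)
```

The cross join keeps the left rows in order and cycles through the right rows for each of them, so the 16 rows should come out in the same order as the loop-based table above. To cover other pairings, the two filters would need to change; this sketch does not attempt to enumerate pairings automatically.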
