Get the idxmax or idxmin of a tuple-valued column in a groupby

Posted 2024-05-19 00:44:26


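I have a tuple-valued score column, and I want to get the rows that hold the maximum value within each group. A toy example of what I am trying to do: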

import pandas as pd
df = pd.DataFrame({'id': ['a', 'a', 'b', 'b'], 
                   'score': [(1,1,1), (1,1,2), (0, 0, 100), (8,8,8)], 
                   'numeric_score': [1, 2, 3, 4],
                   'value':['foo', 'bar', 'baz', 'qux']})
# Works, gives correct result:
correct_df = df.loc[df.groupby('id')['numeric_score'].idxmax(), :]
# Fails with a TypeError
goal_df = df.loc[df.groupby('id')['score'].idxmax(), :]

correct_df has the result I want. The goal_df line raises a chain of errors, the core of which seems to be:

TypeError: reduction operation 'argmax' not allowed for this dtype

A workable but ugly solution is:

best_scores = df.groupby('id')['score'].max().reset_index()[['id', 'score']]
goal_df = (pd.merge(df, best_scores, on=['id', 'score'])
           .groupby(['id'])
           .first()
           .reset_index())

Is there a slicker version?


1 answer

Anonymous user

#1 · Posted 2024-05-19 00:44:26

I understand your question to be:

"NumPy's .argmax() does not work on tuples. For a Series of tuples, how do I determine the index of the maximum tuple?"

IIUC, this returns the desired result:

df.loc[df.score == df.score.max()]
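
Note that df.score.max() is taken over the whole column, so the one-liner above selects the row(s) matching the overall maximum rather than one row per id. If a per-group idxmax/idxmin is what you are after, one possible sketch is to let Python's built-in max and min do the tuple comparison inside each group. The helper name tuple_idx is hypothetical (it is not a pandas function), and the sketch assumes the index labels of df are unique:

```python
import pandas as pd

df = pd.DataFrame({'id': ['a', 'a', 'b', 'b'],
                   'score': [(1, 1, 1), (1, 1, 2), (0, 0, 100), (8, 8, 8)],
                   'numeric_score': [1, 2, 3, 4],
                   'value': ['foo', 'bar', 'baz', 'qux']})


def tuple_idx(scores, pick=max):
    """Hypothetical helper: index label of the largest (or smallest) tuple.

    Uses Python's built-in max/min, which compare tuples lexicographically
    and return the first match on ties. Assumes unique index labels, so
    that scores.loc[label] yields a single tuple.
    """
    return pick(scores.index, key=lambda label: scores.loc[label])


# One index label per id, analogous to groupby(...).idxmax()
best_idx = df.groupby('id')['score'].apply(tuple_idx)
goal_df = df.loc[best_idx]

# idxmin analogue: pick with min instead of max
worst_idx = df.groupby('id')['score'].apply(lambda s: tuple_idx(s, pick=min))
worst_df = df.loc[worst_idx]
```

This calls a Python function once per group, so it will likely be slower than a native idxmax on large frames. In exchange it avoids the merge and keeps the original index labels.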
