Using Pandas groupby just to drop duplicate items

User

Reply #1 · edited 2024-04-20 01:19:44

This part of your question makes me think this may be the answer you're after:

Is there a different workaround without even using groupby

If you just want to drop duplicate rows based on Fruit, .drop_duplicates is the way to go.

df.drop_duplicates(subset='Fruit')

     Name   Fruit  Amount
1    Jack   Lemon       3
2    Mary  Banana       6
4  Sophie  Cherry      10

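You only have limited control over which rows are kept (the keep parameter accepts 'first', 'last', or False); see the docstring.

This is faster and more readable than groupby + first.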
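To make the keep options concrete, here is a minimal sketch. The frame is an assumption on my part: the original question's data is not shown in this thread, so I picked values that are consistent with the outputs posted here.

```python
import pandas as pd

# Hypothetical data: the original frame is not shown in this thread.
df = pd.DataFrame(
    {
        "Name": ["Jack", "Mary", "Sophie", "Sophie", "Daniel", "Daniel"],
        "Fruit": ["Lemon", "Banana", "Lemon", "Cherry", "Banana", "Cherry"],
        "Amount": [3, 6, 1, 10, 2, 4],
    },
    index=range(1, 7),
)

# keep="first" (the default) keeps the first row seen for each Fruit.
first_rows = df.drop_duplicates(subset="Fruit")

# keep="last" keeps the last row seen for each Fruit instead.
last_rows = df.drop_duplicates(subset="Fruit", keep="last")

# keep=False drops every row whose Fruit occurs more than once.
unique_only = df.drop_duplicates(subset="Fruit", keep=False)

print(first_rows)
print(last_rows)
print(unique_only)
```

With this assumed data every Fruit occurs twice, so keep=False leaves an empty frame. The original index labels are preserved in all three results.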

User

Reply #2 · edited 2024-04-20 01:19:44

IIUC, you could use pivot_table, which returns a DataFrame:

In [140]: df.pivot_table(index='Fruit')
Out[140]:
        Amount
Fruit
Banana       4
Cherry       7
Lemon        2

In [141]: type(df.pivot_table(index='Fruit'))
Out[141]: pandas.core.frame.DataFrame

If you want to keep the first element, you can define a function and pass it to the aggfunc argument:

df.pivot_table(index='Fruit', aggfunc=lambda x: x.iloc[0])

If you don't want Fruit as the index, you can also use reset_index:

In [147]: df.pivot_table(index='Fruit', aggfunc=lambda x: x.iloc[0]).reset_index()
Out[147]:
    Fruit  Amount    Name
0  Banana       6    Mary
1  Cherry      10  Sophie
2   Lemon       3    Jack
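One difference from drop_duplicates that may matter: the pivot_table result comes back sorted by Fruit, while drop_duplicates keeps the original row order. A small sketch, again on assumed data (the question's frame is not shown in this thread):

```python
import pandas as pd

# Hypothetical data, chosen to be consistent with the outputs in this thread.
df = pd.DataFrame(
    {
        "Name": ["Jack", "Mary", "Sophie", "Sophie", "Daniel", "Daniel"],
        "Fruit": ["Lemon", "Banana", "Lemon", "Cherry", "Banana", "Cherry"],
        "Amount": [3, 6, 1, 10, 2, 4],
    },
    index=range(1, 7),
)

# First element of every column per Fruit; groups come back sorted by Fruit.
via_pivot = df.pivot_table(index="Fruit", aggfunc=lambda x: x.iloc[0]).reset_index()

# First row per Fruit, in order of first appearance.
via_drop = df.drop_duplicates(subset="Fruit")

print(list(via_pivot["Fruit"]))
print(list(via_drop["Fruit"]))
```

If the order of first appearance matters to you, drop_duplicates is probably the simpler choice.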

User

Reply #3 · edited 2024-04-20 01:19:44

If you only need some of the rows, you can combine groupby + first + reset_index, which keeps the first row of each group:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 1, 2], 'b': [1, 2, 3]})
>>> df.groupby(df.a).first().reset_index()
   a  b
0  1  1
1  2  3
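One caveat with this combination: GroupBy.first() takes the first non-null value of each column within a group, so with missing data it does not necessarily return the first row. A minimal sketch, using a made-up frame with a NaN in it:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row of group a == 1 has a missing value in b.
df = pd.DataFrame({"a": [1, 1, 2], "b": [np.nan, 2.0, 3.0]})

# first() skips NaN, so b for group 1 is taken from the second row.
via_first = df.groupby("a").first().reset_index()

# drop_duplicates keeps the first row as it is, NaN included.
via_drop = df.drop_duplicates(subset="a").reset_index(drop=True)

print(via_first)
print(via_drop)
```

When the data has no missing values, the two approaches give the same rows.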
