Pandas: keep only the unique values in a DataFrame for a given id

Posted 2024-03-28 16:03:27

This is the DataFrame I am working with:

import pandas as pd

df = pd.DataFrame({'id' : ['45', '45', '45', '45', '46', '46'],
                   'description' : ['credit score too low', 'credit score too low', 'credit score too low', 'high risk of fraud', 'address not verified', 'address not verified']})
print(df)

I am trying to modify the DataFrame so that, for a given id, no description is repeated. The DataFrame below is the desired output:

newdf = pd.DataFrame({'id' : ['45', '45', '46'],
                      'description' : ['credit score too low', 'high risk of fraud', 'address not verified']})
print(newdf)

1 answer
User
#1 · Posted 2024-03-28 16:03:27

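You can remove the duplicate rows with df.drop_duplicates() [pandas-doc]. By default it compares all columns, so a row is dropped only when both its id and its description match an earlier row. For example: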

>>> df
   id           description
0  45  credit score too low
1  45  credit score too low
2  45  credit score too low
3  45    high risk of fraud
4  46  address not verified
5  46  address not verified
>>> df.drop_duplicates()
   id           description
0  45  credit score too low
3  45    high risk of fraud
4  46  address not verified

So you can assign the result back to df, like this:

df = df.drop_duplicates()
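
Note that the deduplicated frame keeps the original row labels (0, 3, 4), whereas the desired output in the question is numbered 0, 1, 2. As a minimal sketch, assuming you want the index renumbered and want to spell out which columns define a duplicate, something like the following may work. The variable name deduped is only illustrative and does not appear in the original question.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["45", "45", "45", "45", "46", "46"],
    "description": [
        "credit score too low", "credit score too low", "credit score too low",
        "high risk of fraud", "address not verified", "address not verified",
    ],
})

# Treat two rows as duplicates only when both id and description match,
# and keep the first occurrence of each (id, description) pair.
deduped = df.drop_duplicates(subset=["id", "description"], keep="first")

# Renumber the rows 0..n-1 so the index matches the desired output frame.
deduped = deduped.reset_index(drop=True)

print(deduped)
```

Listing the columns in subset makes no difference for this two-column frame, but it should keep the behavior the same if more columns are added later.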
