Pandas: assign a unique ID to each group in a grouped DataFrame

unique_combination = 1  # acts as a counter
df['unique_combination'] = 0
for idx, row in df.iterrows():
    if len(df.query('A == @row.A & B == @row.B & C == @row.C')) > 1:
        # check, if one occurrence of the combination already has a value > 0???
        df.loc[idx, 'unique_combination'] = unique_combination
        unique_combination += 1

2 answers

User

Answer 1 · edited 2024-04-26 14:36:50

A feature added in pandas 0.20.2 creates a column of unique IDs for you automatically:

df['unique_id'] = df.groupby(['A', 'B', 'C']).ngroup()

Each row receives the integer ID of its group, numbered from 0.

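IDs are assigned to the groups in the order in which the groups are iterated, which is sorted key order unless sort=False is passed to groupby.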

See the documentation here: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#enumerate-groups
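As a minimal sketch of the call above, the following applies ngroup() to a small frame. The sample data is made up purely for illustration and is not the answer's original example; the column name first_seen_id is likewise an assumption.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the numbering.
df = pd.DataFrame({
    "A": [2, 1, 1, 2, 1],
    "B": [1, 1, 1, 2, 2],
    "C": [1, 1, 1, 2, 2],
})

# Default behaviour: groups are numbered from 0 in sorted key order.
df["unique_id"] = df.groupby(["A", "B", "C"]).ngroup()

# With sort=False, groups are numbered in order of first appearance instead.
df["first_seen_id"] = df.groupby(["A", "B", "C"], sort=False).ngroup()

print(df)
```

Which numbering is preferable depends on whether the IDs should be stable under row reordering (sorted keys) or follow the order of the data (first appearance).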

User

Answer 2 · edited 2024-04-26 14:36:50

Step 1: Create a new column initialized to 0

df['new'] = 0

Step 2: Build a boolean mask, named mask, that is True for every row whose (A, B, C) combination occurs more than once.
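One possible way to build such a mask, offered as a sketch and not necessarily the answerer's own code, is DataFrame.duplicated with keep=False. The sample frame below is reconstructed from the A, B and C columns of the output shown at the end of this answer.

```python
import pandas as pd

# A, B, C values taken from the output table in this answer.
df = pd.DataFrame({
    "A": [2, 1, 1, 2, 1, 1, 1, 1, 1, 2],
    "B": [1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "C": [1, 1, 1, 2, 2, 1, 2, 1, 2, 1],
})
df["new"] = 0

# keep=False flags every member of a repeated combination,
# not only the second and later occurrences.
mask = df.duplicated(subset=["A", "B", "C"], keep=False)

print(mask.tolist())
```

Rows 0, 3 and 9 hold combinations that appear only once, so they are the only rows left unmasked.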

Step 3: Assign factorized values to the masked rows, i.e.

df.loc[mask,'new'] = df.loc[mask,['A','B','C']].astype(str).sum(1).factorize()[0] + 1

# or, with groupby (sort=False numbers the groups by first appearance, matching factorize)
# df.loc[mask,'new'] = df.loc[mask,['A','B','C']].groupby(['A','B','C'], sort=False).ngroup()+1

Output:

   A  B  C  new
0  2  1  1    0
1  1  1  1    1
2  1  1  1    1
3  2  2  2    0
4  1  2  2    2
5  1  2  1    3
6  1  2  2    2
7  1  2  1    3
8  1  2  2    2
9  2  2  1    0
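The three steps can be combined into one helper, sketched below under stated assumptions: the function name assign_duplicate_group_ids and its parameters are hypothetical, and the mask is built with duplicated(keep=False), which may differ from the answer's own mask expression.

```python
import pandas as pd


def assign_duplicate_group_ids(df, keys, column="new"):
    """Return a copy where repeated key combinations get IDs from 1 and unique rows get 0."""
    out = df.copy()
    out[column] = 0                                   # step 1: default value
    mask = out.duplicated(subset=keys, keep=False)    # step 2: repeated combinations
    # step 3: sort=False numbers groups by first appearance, like factorize()
    ids = out.loc[mask].groupby(keys, sort=False).ngroup() + 1
    out.loc[mask, column] = ids                       # aligned on the row index
    return out


# A, B, C values taken from the output table above.
df = pd.DataFrame({
    "A": [2, 1, 1, 2, 1, 1, 1, 1, 1, 2],
    "B": [1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "C": [1, 1, 1, 2, 2, 1, 2, 1, 2, 1],
})

result = assign_duplicate_group_ids(df, ["A", "B", "C"])
print(result)
```

Grouping on the key columns directly also sidesteps a weakness of the string-concatenation approach, where distinct combinations such as (1, 11, 1) and (11, 1, 1) would both concatenate to "1111" and receive the same ID.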
