Incrementing a column based on duplicate values

Published 2024-04-29 13:34:10

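I have a DataFrame and want to add a column whose values are incremented according to how many times a value has been repeated. The first occurrence of a value stays unchanged, and every later occurrence gets a running numeric suffix. So given this input DataFrame: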

    SM
 0  AB
 1  AC
 2  AD
 3  AB
 4  AB
 5  AC
 6  AE
 7  AD

the result should be:

     SM DM
  0  AB AB
  1  AC AC
  2  AD AD
  3  AB AB_1
  4  AB AB_2
  5  AC AC_1
  6  AE AE
  7  AD AD_1

I tried this line of code, but I don't know how to make the suffix increment:

 np.where(a.SM.duplicated(keep='first'), a.SM+'_1', a.SM)

2 Answers

Use GroupBy.cumcount together with Series.where:

s = df.groupby('SM').cumcount()

df['DM'] = df['SM'].where(s.eq(0), df['SM'] + '_' + s.astype(str))

Output:

   SM    DM
0  AB    AB
1  AC    AC
2  AD    AD
3  AB  AB_1
4  AB  AB_2
5  AC  AC_1
6  AE    AE
7  AD  AD_1
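
For anyone who wants to run the answer above end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption: it simply rebuilds the sample data from the question. The rest repeats the two lines of the answer with comments.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df = pd.DataFrame({"SM": ["AB", "AC", "AD", "AB", "AB", "AC", "AE", "AD"]})

# Position of each row within its SM group:
# 0 for the first occurrence, then 1, 2, ... for each repeat
s = df.groupby("SM").cumcount()

# Keep SM where the counter is 0; otherwise append "_<counter>"
df["DM"] = df["SM"].where(s.eq(0), df["SM"] + "_" + s.astype(str))

print(df)
```

Because cumcount restarts at 0 for each distinct value, the first occurrence needs no separate duplicate check: the counter being 0 already identifies it.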

In R, building a per-group counter minus 1 with dplyr and joining it on with paste() gives the desired result:

library(dplyr)
library(tidyr)
# Flag rows whose SM value has already appeared earlier
df$BoolDup<-duplicated(df$SM)
# Build a per-group running count minus 1 (0 for the first occurrence), then paste it onto SM for the flagged rows
df %>% mutate(count = 1) %>% 
  group_by(SM)%>%
  mutate(count2 = cumsum(count)-1) %>%
  mutate(DM = ifelse(BoolDup==TRUE,paste(SM,"_",count2,sep =""), SM))%>%
  dplyr::select(SM=SM, DM=DM)

# A tibble: 8 x 2
# Groups:   SM [4]
# SM    DM   
# <chr> <chr>
# 1 AB    AB   
# 2 AC    AC   
# 3 AD    AD   
# 4 AB    AB_1 
# 5 AB    AB_2 
# 6 AC    AC_1 
# 7 AE    AE   
# 8 AD    AD_1 
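
The same flag-plus-counter logic can also be sketched in pandas, staying close to the np.where attempt from the question. This is only an illustration under assumptions: the DataFrame is rebuilt from the sample data, and the names is_dup and counter are hypothetical helpers, not part of either answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df = pd.DataFrame({"SM": ["AB", "AC", "AD", "AB", "AB", "AC", "AE", "AD"]})

# True for every occurrence after the first (mirrors BoolDup in the R code)
is_dup = df["SM"].duplicated(keep="first")

# Per-group running count starting at 0 (mirrors count2 in the R code)
counter = df.groupby("SM").cumcount().astype(str)

# Append the counter only on flagged rows; leave first occurrences as they are
df["DM"] = np.where(is_dup, df["SM"] + "_" + counter, df["SM"])

print(df)
```

The original attempt hard-coded '_1' as the suffix; swapping that constant for the per-group counter is what makes the suffix increment.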
