Concatenating column values in a table

Posted 2024-06-06 08:39:47


I have the following DataFrame:

    Zone    Store   Department  TTLSales    
0   APV                         220 
1   APV     ST12                100 
2   APV     ST12    Elec        40  
3   APV     ST12    Grocery     20  
4   APV     ST12    CPG         40 

I would like to add a column that concatenates the values, like this:

    Zone    Store   Department  TTLSales    id
0   APV                         220         APV
1   APV     ST12                100         APV.ST12
2   APV     ST12    Elec        40          APV.ST12.Elec
3   APV     ST12    Grocery     20          APV.ST12.Grocery
4   APV     ST12    CPG         40          APV.ST12.CPG

I am new to pandas and have spent a lot of time on this, but I still can't get it to work.


3 answers

Try:

# First, replace NaN in the key columns with empty strings:
df[['Zone','Store','Department']]=df[['Zone','Store','Department']].fillna('')
# Then concatenate with '.' and strip the trailing dots left by empty values:
df['id']=(df['Zone']+'.'+df['Store']+'.'+df['Department']).str.rstrip('.')
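
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same steps. It assumes the blank cells in the question are NaN; the names `key_cols` and `filled` are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Zone": ["APV"] * 5,
    "Store": [np.nan, "ST12", "ST12", "ST12", "ST12"],
    "Department": [np.nan, np.nan, "Elec", "Grocery", "CPG"],
    "TTLSales": [220, 100, 40, 20, 40],
})

key_cols = ["Zone", "Store", "Department"]

# Work on a filled copy so the original NaN values stay in df.
filled = df[key_cols].fillna("")

# Vectorized string concatenation, then drop the trailing dots
# produced by empty Store/Department values.
df["id"] = (
    filled["Zone"] + "." + filled["Store"] + "." + filled["Department"]
).str.rstrip(".")

print(df)
```

Using a filled copy rather than overwriting the columns keeps the missing values visible in `df`, which may or may not be what you want.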

If you have more than 4 columns, use apply() instead (performance-wise, the first method is faster than the apply method):

# First, replace NaN in the key columns with empty strings:
df[['Zone','Store','Department']]=df[['Zone','Store','Department']].fillna('')
# Then join each row with '.' and strip the trailing dots left by empty values:
df['id'] = df[['Zone','Store','Department']].apply('.'.join, axis=1).str.rstrip('.')
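
Note that rstrip('.') only removes trailing dots. If a value could be missing in the middle (for example Store empty but Department present), the result would contain a double dot. The sketch below is one possible way to handle that case by joining only the values that are present; `join_present` is a hypothetical helper name, and the third row is made-up data that does not appear in the question.

```python
import numpy as np
import pandas as pd

def join_present(row, sep="."):
    """Join the non-missing, non-empty values of a row with `sep`."""
    return sep.join(str(v) for v in row if pd.notna(v) and v != "")

# Illustrative data; the last row has a gap in the middle (assumption).
df = pd.DataFrame({
    "Zone": ["APV", "APV", "APV"],
    "Store": [np.nan, "ST12", np.nan],
    "Department": [np.nan, "Elec", "CPG"],
})

# Row-wise apply: slower than vectorized concatenation, but tolerant of gaps.
df["id"] = df[["Zone", "Store", "Department"]].apply(join_present, axis=1)

print(df)
```

This trades speed for robustness, so it is probably only worth it when gaps in the middle are actually possible.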

This may be overkill, but here is another way to solve the problem using reduce:

from functools import reduce

cols = ['Zone','Store','Department']
f = lambda x,y : (x +'.'+y).str.rstrip(".")
# or: f = lambda x,y : x.str.cat(y,sep='.').str.rstrip(".")

df['id'] = reduce(f,map(df.fillna('').get, cols))

print(df)

  Zone Store Department  TTLSales                id
0  APV   NaN        NaN       220               APV
1  APV  ST12        NaN       100          APV.ST12
2  APV  ST12       Elec        40     APV.ST12.Elec
3  APV  ST12    Grocery        20  APV.ST12.Grocery
4  APV  ST12        CPG        40      APV.ST12.CPG
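
As a variation on the str.cat idea, the following sketch passes all remaining columns to a single `Series.str.cat` call instead of folding them with reduce. It assumes the same sample data as the question, with blank cells stored as NaN; `na_rep=''` substitutes an empty string for missing values during concatenation.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Zone": ["APV"] * 5,
    "Store": [np.nan, "ST12", "ST12", "ST12", "ST12"],
    "Department": [np.nan, np.nan, "Elec", "Grocery", "CPG"],
    "TTLSales": [220, 100, 40, 20, 40],
})

# Concatenate Zone with the other columns in one call; missing values
# become '' via na_rep, and the trailing dots are stripped afterwards.
df["id"] = (
    df["Zone"]
    .str.cat([df["Store"], df["Department"]], sep=".", na_rep="")
    .str.rstrip(".")
)

print(df)
```

Because `na_rep` handles the missing values, no separate fillna step is needed and the original NaN values remain in `df`.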

You can use df.agg together with str.join here:

df = df.fillna('')
# Strip the trailing dots left by empty Store/Department values
df['id'] = df[['Zone','Store','Department']].agg('.'.join, axis=1).str.rstrip('.')
