How to compare multiple DataFrames and count how often column values occur

Posted 2024-03-29 02:15:19


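Is it possible to compare 4 DataFrames on 2 columns and get a result containing the rows that are repeated (that is, rows that appear in 2 or more DataFrames)? The result should include the number of occurrences. My DataFrames look like this: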

>>>df1
  Circle Division Power 
0 AAAA   AA       25   
1 BBBB   BB       5     
>>>df2
  Circle Division Power 
0 CCCC   CC       25   
1 BBBB   BB       66
>>>df3
  Circle Division Power 
0 DDDD   DD       55   
1 FFFF   FF       68
2 AAAA   AA       87    
>>>df4
  Circle Division Power 
0 AAAA   AA       45   
1 CCCC   CC       56   

Expected result:

>>>result_df
  Circle Division Power1 Power2 Power3 Power4 Repeated
0 AAAA   AA       25     -      87     45     3
1 BBBB   BB       5      66     -      -      2
2 CCCC   CC       -      25     -      56     2 

I tried merging them pair by pair, but then got stuck:

m12 = pd.merge(df1, df2, on=['Circle', 'Division'], how='inner', suffixes=('1', '2'))
m13 = pd.merge(df1, df3, on=['Circle', 'Division'], how='inner', suffixes=('1', '3'))
m14 = pd.merge(df1, df4, on=['Circle', 'Division'], how='inner', suffixes=('1', '4'))
m23 = pd.merge(df2, df3, on=['Circle', 'Division'], how='inner', suffixes=('2', '3'))
m24 = pd.merge(df2, df4, on=['Circle', 'Division'], how='inner', suffixes=('2', '4'))
m34 = pd.merge(df3, df4, on=['Circle', 'Division'], how='inner', suffixes=('3', '4'))

1 Answer
User
#1 · Posted 2024-03-29 02:15:19

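Use concat together with DataFrame.set_index and the keys parameter to join all the DataFrames side by side, then flatten the resulting MultiIndex in the columns.

Create a new column with DataFrame.count to get the number of non-NaN values per row, then filter with boolean indexing: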

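import pandas as pd

dfs = [df1, df2, df3, df4]

# Index every frame by the two key columns so concat aligns rows on them.
comp = [x.set_index(['Circle', 'Division']) for x in dfs]
# keys=1..4 adds an outer column level identifying the source frame.
df = pd.concat(comp, axis=1, keys=range(1, len(dfs) + 1))
# Flatten the (key, 'Power') MultiIndex into Power1 ... Power4.
df.columns = [f'{b}{a}' for a, b in df.columns]
# Count the non-NaN values per row, i.e. how many frames contain the pair.
df['Repeat'] = df.count(axis=1)

# Keep only the pairs that occur in at least two frames.
df = df[df['Repeat'] > 1]
df = df.reset_index()
print(df)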
  Circle Division  Power1  Power2  Power3  Power4  Repeat
0   AAAA       AA    25.0     NaN    87.0    45.0       3
1   BBBB       BB     5.0    66.0     NaN     NaN       2
2   CCCC       CC     NaN    25.0     NaN    56.0       2
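
For anyone who wants to try this end to end, the following is a minimal, self-contained sketch of the same concat approach, using the sample data from the question. The helper name count_repeats, the variable names, and the "-" display formatting are illustrative assumptions, not part of the original answer. It also assumes each (Circle, Division) pair occurs at most once per frame and that Power is never NaN in the inputs, since the count relies on non-NaN Power values.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"Circle": ["AAAA", "BBBB"],
                    "Division": ["AA", "BB"], "Power": [25, 5]})
df2 = pd.DataFrame({"Circle": ["CCCC", "BBBB"],
                    "Division": ["CC", "BB"], "Power": [25, 66]})
df3 = pd.DataFrame({"Circle": ["DDDD", "FFFF", "AAAA"],
                    "Division": ["DD", "FF", "AA"], "Power": [55, 68, 87]})
df4 = pd.DataFrame({"Circle": ["AAAA", "CCCC"],
                    "Division": ["AA", "CC"], "Power": [45, 56]})


def count_repeats(dfs, keys=("Circle", "Division")):
    """Hypothetical helper: align frames on `keys` and count occurrences."""
    indexed = [d.set_index(list(keys)) for d in dfs]
    # The outer column level (1, 2, ...) records which frame a value came from.
    wide = pd.concat(indexed, axis=1, keys=list(range(1, len(dfs) + 1)))
    # Flatten (frame_number, column_name) into e.g. "Power1".
    wide.columns = [f"{name}{i}" for i, name in wide.columns]
    # Non-NaN values per row = number of frames containing the key pair.
    wide["Repeated"] = wide.count(axis=1)
    wide = wide[wide["Repeated"] > 1]
    # Sort so the row order does not depend on how concat unions the indexes.
    return wide.sort_index().reset_index()


result = count_repeats([df1, df2, df3, df4])

# Optional display copy: show "-" instead of NaN, as in the expected result.
display = result.copy()
for col in [c for c in display.columns if c.startswith("Power")]:
    display[col] = display[col].map(lambda v: "-" if pd.isna(v) else str(int(v)))
print(display)
```

The numeric result frame keeps NaN for missing values, which is usually easier to work with; the string-formatted display copy is only meant for presentation.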
