Is there a Pythonic way to iterate over two DataFrames and compare their rows?

Posted 2024-05-16 19:31:15


Given the following two DataFrames:

df1 (read from an Excel spreadsheet):

import pandas as pd

data1 = {'ID': ['1', '2'],
         'Prod Family Desc': ['Install', 'Maintenance'],
         'Prod Family Code': ['', ''],
         'Prod Type Desc': ['Installation Serice', 'Maintenance Service'],
         'Prod Type Code': ['', '']}
df1 = pd.DataFrame(data1)
print(df1)

Resulting df1:

  ID Prod Family Desc Prod Family Code       Prod Type Desc Prod Type Code
0  1          Install                   Installation Serice
1  2      Maintenance                   Maintenance Service

df2 (the result of a SQL query):

data2 = {'Prod Class':['F','F','T','T'],
        'Prod Desc':['Install','Maintenance','Installation Serice','Maintenance Service'],'Prod Code':['2525','2534','H123','H321']
        }

df2 = pd.DataFrame(data2) 
print(df2)

Resulting df2:

  Prod Class            Prod Desc Prod Code
0          F              Install      2525
1          F          Maintenance      2534
2          T  Installation Serice      H123
3          T  Maintenance Service      H321

What is the best way to assign the Prod Code values from df2 to the Prod Family Code and Prod Type Code columns in df1?

This is what I do now:

stype = df2.loc[df2['Prod Class'] == "T"] 

family = df2.loc[df2['Prod Class'] == "F"]

for i, concaterow in df1.iterrows():
    for j, styp in stype.iterrows():

        if (concaterow['Prod Type Desc'] == styp['Prod Desc']):
            df1.loc[i,'Prod Type Code'] = styp['Prod Code']

    for j, scat in family.iterrows():
        if (concaterow['Prod Family Desc'] == scat['Prod Desc']):
            df1.loc[i,'Prod Family Code'] = scat['Prod Code']

print(df1)

The result is as expected:

  ID Prod Family Desc Prod Family Code       Prod Type Desc Prod Type Code
0  1          Install             2525  Installation Serice           H123
1  2      Maintenance             2534  Maintenance Service           H321

Is there a better way to do this operation?

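Edit: in answer to @FatihAkici's question.

@FatihAkici - Since df2 is the result of a SQL query, the result I expect is the latest value inserted into the table. So, suppose df2 is as follows: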

data2 = {'Prod Class': ['F', 'F', 'F', 'T', 'T'],
         'Prod Desc': ['Install', 'Maintenance', 'Install', 'Installation Serice', 'Maintenance Service'],
         'Prod Code': ['2525', '2534', '2536', 'H123', 'H321']}

The expected result would be:

  ID Prod Family Desc Prod Family Code       Prod Type Desc Prod Type Code
0  1          Install             2536  Installation Serice           H123
1  2      Maintenance             2534  Maintenance Service           H321

2 Answers

You can combine pd.DataFrame.assign and pd.DataFrame.merge:

df1.assign(**{
    "Prod Family Code" : df1.merge(df2, left_on = "Prod Family Desc", right_on = "Prod Desc")["Prod Code"],
    "Prod Type Code"   : df1.merge(df2, left_on = "Prod Type Desc", right_on = "Prod Desc")["Prod Code"]})

In your example, df1 already contains two empty columns, Prod Family Code and Prod Type Code, which receive the results, but this method does not require them to exist beforehand.
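One caveat: the merges above are inner joins, so if df2 repeats a description (as in the question's edit) they return more rows than df1 has, and the index alignment done by assign may no longer line up. A minimal sketch of one alternative follows, using Series.map instead of merge. It assumes the row order of df2 reflects insertion order, so the last duplicate is the latest value; code_lookup is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': ['1', '2'],
    'Prod Family Desc': ['Install', 'Maintenance'],
    'Prod Type Desc': ['Installation Serice', 'Maintenance Service'],
})
df2 = pd.DataFrame({
    'Prod Class': ['F', 'F', 'F', 'T', 'T'],
    'Prod Desc': ['Install', 'Maintenance', 'Install',
                  'Installation Serice', 'Maintenance Service'],
    'Prod Code': ['2525', '2534', '2536', 'H123', 'H321'],
})


def code_lookup(df, prod_class):
    """Hypothetical helper: Series mapping each description of one class to its code."""
    subset = df[df['Prod Class'] == prod_class]
    # Assumption: later rows are newer, so keep only the last duplicate.
    subset = subset.drop_duplicates(subset='Prod Desc', keep='last')
    return subset.set_index('Prod Desc')['Prod Code']


# map() looks each description up in the Series index; unmatched values become NaN.
result = df1.assign(**{
    'Prod Family Code': df1['Prod Family Desc'].map(code_lookup(df2, 'F')),
    'Prod Type Code': df1['Prod Type Desc'].map(code_lookup(df2, 'T')),
})
print(result)
```

If the SQL query can return a timestamp or an auto-increment key, sorting df2 by that column before calling drop_duplicates would make "latest" explicit rather than relying on row order.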

I believe merge can do what you want:

df1.merge(df2, how='left', left_on=['Prod Family Desc'], right_on=['Prod Desc'])
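That single merge only covers the family description and leaves the Prod Class, Prod Desc and Prod Code columns of df2 in the result. A rough sketch of how it might be extended to fill both code columns is below; it assumes each description appears once per class in df2, and the names merged, lookup and kind are made up for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': ['1', '2'],
    'Prod Family Desc': ['Install', 'Maintenance'],
    'Prod Family Code': ['', ''],
    'Prod Type Desc': ['Installation Serice', 'Maintenance Service'],
    'Prod Type Code': ['', ''],
})
df2 = pd.DataFrame({
    'Prod Class': ['F', 'F', 'T', 'T'],
    'Prod Desc': ['Install', 'Maintenance',
                  'Installation Serice', 'Maintenance Service'],
    'Prod Code': ['2525', '2534', 'H123', 'H321'],
})

# Drop the empty placeholder columns so the merged ones do not clash by name.
merged = df1.drop(columns=['Prod Family Code', 'Prod Type Code'])

for prod_class, kind in [('F', 'Family'), ('T', 'Type')]:
    # Keep only this class and rename its columns to match df1's naming.
    lookup = (
        df2.loc[df2['Prod Class'] == prod_class, ['Prod Desc', 'Prod Code']]
        .rename(columns={'Prod Desc': f'Prod {kind} Desc',
                         'Prod Code': f'Prod {kind} Code'})
    )
    # A left merge keeps every row of df1, even those without a match.
    merged = merged.merge(lookup, how='left', on=f'Prod {kind} Desc')

# Restore the original column order of df1.
merged = merged[df1.columns]
print(merged)
```

Filtering by Prod Class before merging matters here: without it, a family description that happens to equal a type description would pick up the wrong code.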
