Answers to: Populate a DataFrame column from another DataFrame based on a partial string match


1 answer

Anonymous, 1 day ago


Initialize the DataFrames provided in the question:

<pre><code>import numpy as np
import pandas as pd

df1 = pd.DataFrame([['PT_WOA', '.ZS01_LA120_T05.SB.S2384_LesSwL', 10],
                    ['PT_WOA', '.ZS01_RB2202_T05.SB.S2385_FLOK', 10],
                    ['PT_WOA', '.ZS01_LA120_T05.SB._CBAbsHy', 10],
                    ['PT_WOA', '.ZS01_LA120_T05.SB.S3110_CBAPV', 10],
                    ['PT_WOA', '.ZS01_LARB2204.SB.S3111_CBRelHy', 10]],
                   columns=['Line', 'TagName', 'CLASS'],
                   index=[187877, 187878, 187879, 187880, 187881])

df2 = pd.DataFrame([[1311256, 'Lifting table', 'LA120'],
                    [1311257, 'Roller bed', 'RB2200'],
                    [1311258, 'Lifting table', 'LT2202'],
                    [1311259, 'Roller bed', 'RB2202'],
                    [1311260, 'Roller bed', 'RB2204']],
                   columns=['EquipmentNo', 'EquipmentDescription', 'Equipment'])
</code></pre>

I suggest the following:

<pre><code># create a copy of df1, dropping the 'CLASS' column
df3 = df1.drop(columns=['CLASS'])

# add the columns 'EquipmentDescription' and 'EquipmentNo';
# 'EquipmentDescription' will hold strings, so start it as an object column (None)
df3['EquipmentDescription'] = None
df3['EquipmentNo'] = np.nan

# for each row in df3, iterate over each row in df2
for index_df3, row_df3 in df3.iterrows():
    for index_df2, row_df2 in df2.iterrows():
        # check if 'Equipment' is a substring of 'TagName'
        if df2.loc[index_df2, 'Equipment'] in df3.loc[index_df3, 'TagName']:
            # set 'EquipmentDescription' and 'EquipmentNo'
            # (if several df2 rows match, the last one wins)
            df3.loc[index_df3, 'EquipmentDescription'] = df2.loc[index_df2, 'EquipmentDescription']
            df3.loc[index_df3, 'EquipmentNo'] = df2.loc[index_df2, 'EquipmentNo']

# convert 'EquipmentNo' to type int
# (this raises an error if any row found no match and is still NaN)
df3['EquipmentNo'] = df3['EquipmentNo'].astype(int)
</code></pre>

This produces the following DataFrame:

<pre><code>          Line                          TagName  EquipmentDescription  EquipmentNo
187877  PT_WOA  .ZS01_LA120_T05.SB.S2384_LesSwL         Lifting table      1311256
187878  PT_WOA   .ZS01_RB2202_T05.SB.S2385_FLOK            Roller bed      1311259
187879  PT_WOA      .ZS01_LA120_T05.SB._CBAbsHy         Lifting table      1311256
187880  PT_WOA   .ZS01_LA120_T05.SB.S3110_CBAPV         Lifting table      1311256
187881  PT_WOA  .ZS01_LARB2204.SB.S3111_CBRelHy            Roller bed      1311260
</code></pre>

Let me know if this helps.
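If the nested loops turn out to be slow on larger frames, one possible alternative is to pull the equipment code out of each 'TagName' with a regular expression and then merge with df2. This is only a sketch under some assumptions that are not stated in the question: the 'Equipment' values in df2 are unique, and when a 'TagName' contains more than one code, taking the leftmost one is acceptable (the loop version keeps the last matching df2 row instead). The frames below are shortened stand-ins for df1 and df2.

```python
import re

import pandas as pd

# Shortened stand-ins for the question's df1 and df2 (same column names).
df1 = pd.DataFrame(
    {
        "Line": ["PT_WOA", "PT_WOA", "PT_WOA"],
        "TagName": [
            ".ZS01_LA120_T05.SB.S2384_LesSwL",
            ".ZS01_RB2202_T05.SB.S2385_FLOK",
            ".ZS01_LARB2204.SB.S3111_CBRelHy",
        ],
        "CLASS": [10, 10, 10],
    },
    index=[187877, 187878, 187881],
)
df2 = pd.DataFrame(
    {
        "EquipmentNo": [1311256, 1311257, 1311259, 1311260],
        "EquipmentDescription": ["Lifting table", "Roller bed", "Roller bed", "Roller bed"],
        "Equipment": ["LA120", "RB2200", "RB2202", "RB2204"],
    }
)

# One alternation pattern built from all equipment codes,
# escaped so that each code is matched literally.
pattern = "(" + "|".join(re.escape(code) for code in df2["Equipment"]) + ")"

df3 = df1.drop(columns=["CLASS"])
# Leftmost equipment code found in each TagName (missing when nothing matches).
df3["Equipment"] = df3["TagName"].str.extract(pattern, expand=False)

# Left-merge on the extracted code. merge() discards the index,
# so move it into a column first and restore it afterwards.
df3 = (
    df3.reset_index()
    .merge(df2, on="Equipment", how="left")
    .set_index("index")
    .rename_axis(None)
)[["Line", "TagName", "EquipmentDescription", "EquipmentNo"]]

print(df3)
```

Rows whose 'TagName' contains none of the codes keep missing values in the two new columns, in which case 'EquipmentNo' becomes a float column rather than int.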
