String matching and fetching multiple columns

df2:

    Item_ID  Item_Name  Material_Name
    1        Paper Bag  Office
    2        wallpaper  Decor
    3        paper      Office
    4        cat cage   Animal Misc
    5        good cat   Animal

Desired result (df1):

    Id  Name       Item_ID  Material_Name
    1   Paper      1,2,3    Office,Decor,Office
    2   Paper Bag  1,2,3    Office,Decor,Office
    3   Scissors   NA       NA
    4   Mat        NA       NA
    5   Cat        4,5      Animal Misc,Animal
    6   Good Cat   4,5      Animal Misc,Animal

    def matcher(query):
        # Item_IDs of every df2 row whose Item_Name contains any word of the query
        matches = [i['Item_ID']
                   for i in df2[['Item_ID', 'Item_Name']].to_dict('records')
                   if any(q in i['Item_Name'].lower() for q in query.lower().split())]
        if matches:
            return ','.join(map(str, matches))
        else:
            return 'NA'

    df1['Item_ID'] = df1['Name'].apply(matcher)

2 Answers

User
#1 · edited 2024-06-16 12:34:56

Here is one approach that builds a Boolean mask and reuses it for both columns:
    import pandas as pd

    def matcher(x):
        # construct 2-way mask: whole-string match OR any single word matches
        m1 = df2['Item_Name'].str.contains(x, regex=False, case=False)
        m2 = pd.Series([any(w in i.lower() for w in x.lower().split())
                        for i in df2['Item_Name']], index=df2.index)
        mask = m1 | m2

        # apply the same mask to both columns
        res_id = df2.loc[mask, 'Item_ID']
        res_mat = df2.loc[mask, 'Material_Name']
        return ','.join(res_id.astype(str)), ','.join(res_mat.astype(str))

    df1[['Item_ID', 'Material_Name']] = pd.DataFrame(df1['Name'].apply(matcher).tolist())

    print(df1)

       Id       Name Item_ID        Material_Name
    0   1      Paper   1,2,3  Office,Decor,Office
    1   2  Paper Bag   1,2,3  Office,Decor,Office
    2   3   Scissors
    3   4        Mat
    4   5        Cat     4,5   Animal Misc,Animal
    5   6   Good Cat     4,5   Animal Misc,Animal
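The printed result above leaves Item_ID and Material_Name empty for names with no match, whereas the desired table in the question shows NA. Below is a minimal self-contained sketch of one way to get NA instead, assuming the sample data from the question. The helper name `match_columns` is hypothetical and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({
    'Item_ID': [1, 2, 3, 4, 5],
    'Item_Name': ['Paper Bag', 'wallpaper', 'paper', 'cat cage', 'good cat'],
    'Material_Name': ['Office', 'Decor', 'Office', 'Animal Misc', 'Animal'],
})
df1 = pd.DataFrame({
    'Id': [1, 2, 3, 4, 5, 6],
    'Name': ['Paper', 'Paper Bag', 'Scissors', 'Mat', 'Cat', 'Good Cat'],
})

def match_columns(name, items=df2):
    """Hypothetical helper: (ids, materials) for items sharing a word with name."""
    words = name.lower().split()
    # True where any query word occurs as a substring of the item name
    mask = items['Item_Name'].str.lower().apply(lambda s: any(w in s for w in words))
    if not mask.any():
        # No item matched: return the NA placeholder for both columns
        return 'NA', 'NA'
    hits = items.loc[mask]
    return ','.join(hits['Item_ID'].astype(str)), ','.join(hits['Material_Name'])

# Expand the returned tuples into the two new columns
df1[['Item_ID', 'Material_Name']] = pd.DataFrame(
    df1['Name'].apply(match_columns).tolist(), index=df1.index)
print(df1)
```

The early return keeps the mask logic untouched and only changes what is reported for rows without a match.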

User
#2 · edited 2024-06-16 12:34:56

You could collect Item_ID and Material_Name from the query as tuples, then use [i[0] for i in matches] or [i[1] for i in matches] to fill the appropriate column.
    def matcher(query):
        # Collect an (Item_ID, Material_Name) tuple for every item whose name
        # contains any word of the query
        matches = [(i['Item_ID'], i['Material_Name'])
                   for i in df2[['Item_ID', 'Item_Name', 'Material_Name']].to_dict('records')
                   if any(q in i['Item_Name'].lower() for q in query.lower().split())]
        if matches:
            return (','.join(map(str, [i[0] for i in matches])),
                    ','.join(map(str, [i[1] for i in matches])))
        return 'NA', 'NA'

    # Expand the returned tuples into the two new columns
    df1[['Item_ID', 'Material_Name']] = pd.DataFrame(df1['Name'].apply(matcher).tolist(), index=df1.index)
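The tuple idea can be tried out without pandas. The following is only a sketch: it assumes the item rows from the question written as plain dicts, and the helper name `find_matches` is hypothetical.

```python
# Rows of the item table from the question, as plain dicts.
rows = [
    {'Item_ID': 1, 'Item_Name': 'Paper Bag', 'Material_Name': 'Office'},
    {'Item_ID': 2, 'Item_Name': 'wallpaper', 'Material_Name': 'Decor'},
    {'Item_ID': 3, 'Item_Name': 'paper', 'Material_Name': 'Office'},
    {'Item_ID': 4, 'Item_Name': 'cat cage', 'Material_Name': 'Animal Misc'},
    {'Item_ID': 5, 'Item_Name': 'good cat', 'Material_Name': 'Animal'},
]

def find_matches(query, rows):
    """Hypothetical helper: (Item_ID, Material_Name) tuples for matching rows."""
    words = query.lower().split()
    return [(r['Item_ID'], r['Material_Name'])
            for r in rows
            if any(w in r['Item_Name'].lower() for w in words)]

matches = find_matches('Good Cat', rows)
ids = [i[0] for i in matches]        # first tuple element: Item_ID
materials = [i[1] for i in matches]  # second tuple element: Material_Name

print(','.join(map(str, ids)))
print(','.join(materials))
```

Keeping both values in one tuple means each row is scanned once, and the two output strings are guaranteed to list the matches in the same order.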
