Matching two DataFrames by substring in Python

Id  Title                   Keyword
1   The house of pump       house
2   Where is Andijan        andijan
3   The Joker               joker
4   Good bars in Andijan    andijan
5   What a beautiful house  house

2 Answers

User

Answer 1 · edited 2024-05-28 20:49:25

Let's try findall:

import re
# findall lists every keyword hit per title; .str[0] keeps the first (NaN if none)
df1['new'] = df1.Title.str.findall('|'.join(df2.Keyword.tolist()), flags=re.IGNORECASE).str[0]
df1
   Id                   Title      new
0   1       The house of pump    house
1   2        Where is Andijan  Andijan
2   3               The Joker    Joker
3   4    Good bars in Andijan  Andijan
4   5  What a beautiful house    house
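
Note that the matched text keeps the title's casing ("Andijan"), while the table in the question shows the keyword's own spelling ("andijan"). A minimal self-contained sketch of one way to get that spelling follows. It assumes the two frames look like the table in the question (the frame construction here is reconstructed, not taken from the original post), and it adds `re.escape` as a precaution in case a keyword contains regex metacharacters.

```python
import re
import pandas as pd

# Sample data reconstructed from the question's table (assumed layout).
df1 = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Title": [
        "The house of pump",
        "Where is Andijan",
        "The Joker",
        "Good bars in Andijan",
        "What a beautiful house",
    ],
})
df2 = pd.DataFrame({"Keyword": ["house", "andijan", "joker"]})

# Escape each keyword so regex metacharacters are matched literally.
pattern = "|".join(re.escape(k) for k in df2["Keyword"])

# First match per title; titles without any keyword get NaN.
first_match = df1["Title"].str.findall(pattern, flags=re.IGNORECASE).str[0]

# Map the matched text back to the keyword's own spelling.
lookup = {k.lower(): k for k in df2["Keyword"]}
df1["Keyword"] = first_match.str.lower().map(lookup)
print(df1)
```

The lookup dictionary is only one option; if all keywords are already lowercase, calling `.str.lower()` on the match is enough.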

User

Answer 2 · edited 2024-05-28 20:49:25

Building on @BENY's solution so that each title can yield multiple keywords:

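import re
import pandas as pd
# df holds the titles; keywords holds the 'Keyword' column
regex = '|'.join(keywords['Keyword'])
matches = df['Title'].str.findall(regex, flags=re.IGNORECASE)
# One row per match; renamed so the merge does not produce Title_x / Title_y
matches_exploded = matches.explode().dropna().rename('Keyword').to_frame()
df.merge(matches_exploded, left_index=True, right_index=True)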
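
To see the multi-keyword behaviour end to end, here is a small runnable sketch. The sample titles are hypothetical (the second one is invented so that a single title contains two keywords), and `df.join` is used in place of `merge` purely as a shorter equivalent for an index-on-index join.

```python
import re
import pandas as pd

# Hypothetical sample data; row 2 deliberately contains two keywords.
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Title": ["The house of pump", "A house in Andijan", "No match here"],
})
keywords = pd.DataFrame({"Keyword": ["house", "andijan", "joker"]})

# Escape keywords so they are matched literally, then build one pattern.
pattern = "|".join(re.escape(k) for k in keywords["Keyword"])

matches = (
    df["Title"]
    .str.findall(pattern, flags=re.IGNORECASE)  # list of hits per title
    .explode()                                  # one row per hit
    .dropna()                                   # drop titles with no hit
    .str.lower()                                # normalise casing
    .rename("Keyword")
)

# Inner join on the index: a title appears once per matched keyword.
result = df.join(matches, how="inner")
print(result)
```

Titles with no keyword are dropped by the inner join; use `how="left"` instead if they should be kept with a missing `Keyword`.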
