Merge two DataFrames and keep only the unique columns

#df1 -----
   location           Ethnic Origins  Percent(1)
0  Beaches-East York  English         18.9
1  Davenport          Portuguese      22.7
2  Eglinton-Lawrence  Polish          12.0

#df2 -----
   location                                           lat        lng
0  Beaches—East York, Old Toronto, Toronto, Golde...  43.681470  -79.306021
1  Davenport, Old Toronto, Toronto, Golden Horses...  43.671561  -79.448293
2  Eglinton—Lawrence, North York, Toronto, Golden...  43.719265  -79.429765

#desired result -----
   location           Ethnic Origins  Percent(1)  lat        lng
0  Beaches-East York  English         18.9        43.681470  -79.306021
1  Davenport          Portuguese      22.7        43.671561  -79.448293
2  Eglinton-Lawrence  Polish          12.0        43.719265  -79.429765

3 Answers

User

#1 · edited 2024-05-14 23:42:24

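As others have pointed out, the problem is that the two "location" columns share no values. One solution is to use a regular expression to strip everything from the first comma to the end of the string: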

df2.location = df2.location.replace(r',.*', '', regex=True)

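With the exact data you provided, this still will not work, because the two DataFrames use different kinds of dashes: df1 has a plain hyphen (-) and df2 has an em dash (—). You can fix this in a similar way (no regular expression needed this time):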

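# Series.replace without regex=True only swaps whole cell values,
# so use .str.replace to substitute the dash inside each string.
df2.location = df2.location.str.replace('—', '-', regex=False)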

Then merge as you suggested:

df3 = pd.merge(df1, df2, on="location", how="left")
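Put together, the steps in this answer can be sketched end to end as follows. This is only an illustration: the frames are rebuilt by hand from the question's sample, and because the df2.location strings are truncated there, the text after "Toronto" is simply left out here.

```python
import pandas as pd

# Sample data rebuilt from the question. The df2 strings are truncated in the
# original post, so only the visible leading part is used (an assumption).
df1 = pd.DataFrame({
    "location": ["Beaches-East York", "Davenport", "Eglinton-Lawrence"],
    "Ethnic Origins": ["English", "Portuguese", "Polish"],
    "Percent(1)": [18.9, 22.7, 12.0],
})
df2 = pd.DataFrame({
    "location": [
        "Beaches—East York, Old Toronto, Toronto",
        "Davenport, Old Toronto, Toronto",
        "Eglinton—Lawrence, North York, Toronto",
    ],
    "lat": [43.681470, 43.671561, 43.719265],
    "lng": [-79.306021, -79.448293, -79.429765],
})

# 1. Drop everything from the first comma to the end of the string.
df2["location"] = df2["location"].replace(r",.*", "", regex=True)

# 2. Swap the em dash for a plain hyphen (substring replacement, no regex).
df2["location"] = df2["location"].str.replace("—", "-", regex=False)

# 3. Left-merge on the cleaned key so every df1 row is kept.
df3 = pd.merge(df1, df2, on="location", how="left")
print(df3)
```

With both cleaning steps applied, every row of df1 should pick up its lat and lng values; skipping step 2 would leave the two hyphenated names unmatched.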

User

#2 · edited 2024-05-14 23:42:24

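My guess is that the columns you are trying to merge on do not match, i.e. no value in df2.location corresponds to a value in df1.location. Try changing those first and it should work: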

df2["location"] = df2["location"].apply(lambda x: x.split(",")[0])
df3 = pd.merge(df1, df2, on="location", how="left")
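One way to check whether the keys really line up after the split is the indicator parameter of pd.merge. The sketch below uses a two-row stand-in for the question's data (shortened by hand, so treat the frames as an assumption) and lists the df1 locations that found no partner in df2.

```python
import pandas as pd

# Two-row stand-in for the question's data (shortened by hand).
df1 = pd.DataFrame({"location": ["Beaches-East York", "Davenport"]})
df2 = pd.DataFrame({
    "location": ["Beaches—East York, Old Toronto", "Davenport, Old Toronto"],
    "lat": [43.681470, 43.671561],
})

# Keep only the text before the first comma, as in the answer above.
df2["location"] = df2["location"].apply(lambda x: x.split(",")[0])

# indicator=True adds a "_merge" column marking each row as "both"
# or "left_only" (no matching key in df2).
check = pd.merge(df1, df2, on="location", how="left", indicator=True)
unmatched = check.loc[check["_merge"] == "left_only", "location"].tolist()
print(unmatched)
```

On this sample the hyphenated name stays unmatched, because df2 still contains an em dash where df1 has a hyphen; normalizing the dash before merging closes that gap.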

User

#3 · edited 2024-05-14 23:42:24

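We should use findall to create the key, searching each longer df2.location string for one of the names in df1.location: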

df2['location']=df2.location.str.findall('|'.join(df1.location)).str[0]
df3 = pd.merge(df1, df2, on="location", how="left")
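A slightly more defensive variant of the same idea might look like the sketch below. It is not the answerer's code: it adds re.escape so that names containing regex metacharacters are matched literally, and it normalizes the em dash first, since otherwise the hyphenated names in df1 never appear in df2. The two-row frames are a hand-shortened stand-in for the question's data.

```python
import re

import pandas as pd

# Two-row stand-in for the question's data (shortened by hand).
df1 = pd.DataFrame({"location": ["Beaches-East York", "Davenport"]})
df2 = pd.DataFrame({
    "location": ["Beaches—East York, Old Toronto", "Davenport, Old Toronto"],
    "lat": [43.681470, 43.671561],
})

# Escape each name so regex metacharacters are treated as literal text.
pattern = "|".join(re.escape(name) for name in df1["location"])

# Normalize the em dash to a hyphen before searching for the df1 names.
normalized = df2["location"].str.replace("—", "-", regex=False)

# findall returns a list of matches per row; .str[0] takes the first one
# (NaN when the list is empty).
df2["location"] = normalized.str.findall(pattern).str[0]

df3 = pd.merge(df1, df2, on="location", how="left")
print(df3)
```

Rows of df2 that contain none of the df1 names end up with a missing key and therefore do not take part in the merge.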
