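I have two DataFrames that were queried from two separate databases. The two sources share some columns, but not always the same ones, and I need a way to reliably join the two DataFrames together.
For example: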
import pandas as pd
inp = [{'Name':'Jose', 'Age':12,'Location':'Frankfurt','Occupation':'Student','Mothers Name':'Rosy'}, {'Name':'Katherine','Age':23,'Location':'Maui','Occupation':'Lawyer','Mothers Name':'Amy'}, {'Name':'Larry','Age':22,'Location':'Dallas','Occupation':'Nurse','Mothers Name':'Monica'}]
df = pd.DataFrame(inp)
print(df)
   Age   Location  Mothers Name       Name  Occupation
0   12  Frankfurt          Rosy       Jose     Student
1   23       Maui           Amy  Katherine      Lawyer
2   22     Dallas        Monica      Larry       Nurse
inp2 = [{'Name': '','Occupation':'Nurse','Favorite Hobby':'Basketball','Mothers Name':'Monica'},{'Name':'Jose','Occupation':'','Favorite Hobby':'Sewing','Mothers Name':'Rosy'},{'Name':'Katherine','Occupation':'Lawyer','Favorite Hobby':'Reading','Mothers Name':''}]
df2 = pd.DataFrame(inp2)
print(df2)
   Favorite Hobby  Mothers Name       Name  Occupation
0      Basketball        Monica                  Nurse
1          Sewing          Rosy       Jose
2         Reading                Katherine      Lawyer
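I need to figure out a way to reliably join these two DataFrames even though the data is not always consistent: any given row of df2 may be missing Name, Occupation, or Mothers Name. To complicate things further, the two databases are not always the same length. Any ideas?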
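You can run a merge on each possible combination of the shared columns, concat the resulting DataFrames, and then merge that new DataFrame back onto the first (complete) df.
This assumes that the combination of Age and Location is unique for each row.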
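The approach described above can be sketched roughly as follows. This is only an illustration, not the answerer's original code: the names shared, matches, hobbies and result are made up for the example, and matching on pairs of shared columns is an assumption about what "possible column combinations" means here.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame([
    {'Name': 'Jose', 'Age': 12, 'Location': 'Frankfurt', 'Occupation': 'Student', 'Mothers Name': 'Rosy'},
    {'Name': 'Katherine', 'Age': 23, 'Location': 'Maui', 'Occupation': 'Lawyer', 'Mothers Name': 'Amy'},
    {'Name': 'Larry', 'Age': 22, 'Location': 'Dallas', 'Occupation': 'Nurse', 'Mothers Name': 'Monica'},
])
df2 = pd.DataFrame([
    {'Name': '', 'Occupation': 'Nurse', 'Favorite Hobby': 'Basketball', 'Mothers Name': 'Monica'},
    {'Name': 'Jose', 'Occupation': '', 'Favorite Hobby': 'Sewing', 'Mothers Name': 'Rosy'},
    {'Name': 'Katherine', 'Occupation': 'Lawyer', 'Favorite Hobby': 'Reading', 'Mothers Name': ''},
])

# Columns present in both frames (assumed to be the candidate join keys).
shared = ['Name', 'Occupation', 'Mothers Name']

# The final merge relies on (Age, Location) identifying one row of df.
if df.duplicated(['Age', 'Location']).any():
    raise ValueError('Age/Location is not unique per row in df')

# Inner-merge on every pair of shared columns. A row of df2 with one
# blank field can still match on the two fields that are filled in.
matches = []
for keys in combinations(shared, 2):
    hit = df.merge(df2, on=list(keys))
    matches.append(hit[['Age', 'Location', 'Favorite Hobby']])

# A row that matches on more than one pair shows up several times.
hobbies = pd.concat(matches, ignore_index=True).drop_duplicates()

# Attach the new column to the complete frame; unmatched rows get NaN.
result = df.merge(hobbies, on=['Age', 'Location'], how='left')
print(result)
```

With how='left', rows of df that have no counterpart in df2 are kept, so the two frames do not need to be the same length. If two different rows of df2 match the same row of df, that row will appear twice in result, so it is worth checking len(result) against len(df).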