I have two DataFrames and need to conditionally update specific columns in the first one.
import numpy as np
import pandas as pd
df1 = pd.DataFrame([[1,'Foo',1,1,1,np.nan,np.nan,np.nan],[2,'Foo',2,2,2,np.nan,np.nan,np.nan],[3,'Bar',3,3,3,np.nan,np.nan,np.nan]], columns=['Key','identifier','A','B','C','D','E','F'])
print(df1)
   Key identifier  A  B  C   D   E   F
0    1        Foo  1  1  1 NaN NaN NaN
1    2        Foo  2  2  2 NaN NaN NaN
2    3        Bar  3  3  3 NaN NaN NaN
df2 = pd.DataFrame([[1,np.nan,10,10,10,5,6,7],[2,np.nan,12,12,12,8,9,10],[3,np.nan,13,13,13,11,12,13]], columns=['Key','identifier','A','B','C','D','E','F'])
print(df2)
   Key  identifier   A   B   C   D   E   F
0    1         NaN  10  10  10   5   6   7
1    2         NaN  12  12  12   8   9  10
2    3         NaN  13  13  13  11  12  13
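If the identifier column in df1 equals 'Foo', I need to update columns D, E, and F of df1 with the corresponding columns from df2. How can I conditionally update these three columns?
df3 = ...  # code here
Expected output:
print(df3)
   Key identifier  A  B  C    D    E     F
0    1        Foo  1  1  1  5.0  6.0   7.0
1    2        Foo  2  2  2  8.0  9.0  10.0
2    3        Bar  3  3  3  NaN  NaN   NaN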
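One possible way to get this output, offered only as a minimal sketch: it assumes the rows of df1 and df2 line up by their default integer index, as they do in the example above. The names `mask` and `target_cols` are illustrative and not from the original post. Assigning a DataFrame through `.loc` aligns on index and column labels, so only the rows selected by the mask are overwritten.

```python
import numpy as np
import pandas as pd

cols = ['Key', 'identifier', 'A', 'B', 'C', 'D', 'E', 'F']
df1 = pd.DataFrame([[1, 'Foo', 1, 1, 1, np.nan, np.nan, np.nan],
                    [2, 'Foo', 2, 2, 2, np.nan, np.nan, np.nan],
                    [3, 'Bar', 3, 3, 3, np.nan, np.nan, np.nan]], columns=cols)
df2 = pd.DataFrame([[1, np.nan, 10, 10, 10, 5, 6, 7],
                    [2, np.nan, 12, 12, 12, 8, 9, 10],
                    [3, np.nan, 13, 13, 13, 11, 12, 13]], columns=cols)

# Boolean mask of the rows in df1 that should be updated (illustrative name).
mask = df1['identifier'] == 'Foo'
# Columns to copy over from df2 (illustrative name).
target_cols = ['D', 'E', 'F']

# Work on a copy so df1 itself is left untouched.
df3 = df1.copy()
# Assumes df1 and df2 share the same positional index for matching rows.
df3.loc[mask, target_cols] = df2.loc[mask, target_cols]

print(df3)
```

This only works while both frames share the same row labels; if the rows are matched by Key rather than by position, the index has to carry that key instead.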
Follow-up
Suppose instead that df1 looks like this:
df1 = pd.DataFrame([[1,'Foo',1,1,1,np.nan,np.nan,np.nan],[4,'Bar',4,4,4,np.nan,np.nan,np.nan],[2,'Foo',2,2,2,np.nan,np.nan,np.nan],[3,'Bar',3,3,3,np.nan,np.nan,np.nan]], columns = ['Key','identifier','A','B','C','D','E','F'])
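Now df1 and df2 have different lengths, and the rows to update no longer line up by position. How can this still work? I get the following output: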
df2[df1['identifier'] == 'Foo'].combine_first(df1)
   Key identifier     A     B     C     D     E     F
0  1.0        Foo  10.0  10.0  10.0   5.0   6.0   7.0
1  4.0        Bar   4.0   4.0   4.0   NaN   NaN   NaN
2  3.0        Foo  13.0  13.0  13.0  11.0  12.0  13.0
3  3.0        Bar   3.0   3.0   3.0   NaN   NaN   NaN
Use combine_first after setting Key as the index with set_index.
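The suggestion above could be sketched roughly as follows for the follow-up data. This is one interpretation, not necessarily the exact code the answer had in mind. It assumes Key is unique in both frames; the names `d1`, `d2`, `foo_keys`, and `patch` are illustrative. Restricting `patch` to columns D, E, F and to the Foo keys, and the final `reindex` to restore df1's row and column order (since `combine_first` returns the union of labels, which may come back sorted), are additions made here for the sketch.

```python
import numpy as np
import pandas as pd

cols = ['Key', 'identifier', 'A', 'B', 'C', 'D', 'E', 'F']
df1 = pd.DataFrame([[1, 'Foo', 1, 1, 1, np.nan, np.nan, np.nan],
                    [4, 'Bar', 4, 4, 4, np.nan, np.nan, np.nan],
                    [2, 'Foo', 2, 2, 2, np.nan, np.nan, np.nan],
                    [3, 'Bar', 3, 3, 3, np.nan, np.nan, np.nan]], columns=cols)
df2 = pd.DataFrame([[1, np.nan, 10, 10, 10, 5, 6, 7],
                    [2, np.nan, 12, 12, 12, 8, 9, 10],
                    [3, np.nan, 13, 13, 13, 11, 12, 13]], columns=cols)

# Index both frames by Key so rows are matched by label, not by position.
# Assumes Key is unique in each frame.
d1 = df1.set_index('Key')
d2 = df2.set_index('Key')

# Keys of the rows in df1 that should receive values from df2.
foo_keys = d1.index[d1['identifier'] == 'Foo']

# Only columns D, E, F for those keys; a key missing from df2 yields NaN rows,
# which combine_first then ignores.
patch = d2[['D', 'E', 'F']].reindex(foo_keys)

# combine_first keeps d1's non-null values and fills its NaNs from patch.
# The reindex restores df1's original row and column order.
df3 = (d1.combine_first(patch)
         .reindex(index=d1.index, columns=d1.columns)
         .reset_index())

print(df3)
```

Because df1's values take priority in `combine_first`, columns A, B, and C keep df1's values, and the Bar rows stay NaN in D, E, and F since `patch` contains no rows for their keys.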