How do I multiply the columns at the same position in two different DataFrames?

import pandas as pd

df1 = pd.DataFrame({'a': {0: 0, 1: 0, 2: 0, 3: 0, 4: 1}, 'b': {0: 1, 1: 0, 2: 1, 3: 0, 4: 0}, 'c': {0: 0, 1: 0, 2: 0, 3: 0, 4: 1}, 'd': {0: 0, 1: 1, 2: 1, 3: 0, 4: 0}, 'e': {0: 0, 1: 1, 2: 0, 3: 1, 4: 0}, 'f': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}, 'g': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}, 'h': {0: 1, 1: 0, 2: 1, 3: 1, 4: 0}, 'i': {0: 0, 1: 1, 2: 0, 3: 1, 4: 0}, 'j': {0: 1, 1: 0, 2: 0, 3: 0, 4: 1}})

df2 = pd.DataFrame({'a_top3': {0: 0.084973365, 1: 0.057013709, 2: 0.072325557, 3: 0.098824218, 4: 0.252425998}, 'b_top3': {0: 0.168823063, 1: 0.044829924, 2: 0.178180799, 3: 0.032501712, 4: 0.054869764}, 'c_top3': {0: 0.040331405, 1: 0.042758454, 2: 0.077851109, 3: 0.111247674, 4: 0.160724968}, 'd_top3': {0: 0.11076121, 1: 0.156901404, 2: 0.111759722, 3: 0.031440482, 4: 0.046660293}, 'e_top3': {0: 0.059534989, 1: 0.090733215, 2: 0.087737411, 3: 0.141953781, 4: 0.011520214}, 'f_top3': {0: 0.067696713, 1: 0.081674345, 2: 0.034215827, 3: 0.075849444, 4: 0.011245198}, 'g_top3': {0: 0.041895844, 1: 0.048191357, 2: 0.102012217, 3: 0.100579783, 4: 0.034403443}, 'h_top3': {0: 0.124932915, 1: 0.085968919, 2: 0.220041335, 3: 0.155145347, 4: 0.032171372}, 'i_top3': {0: 0.103714436, 1: 0.349804282, 2: 0.077229746, 3: 0.150859997, 4: 0.081321001}, 'j_top3': {0: 0.197336018, 1: 0.042124409, 2: 0.038646296, 3: 0.101597518, 4: 0.314657748}})

2 answers

User

Reply #1 · edited 2024-05-17 01:11:31

First, use the merge() method to join the relevant columns of the two DataFrames on their index:

result = df1[['a', 'b', 'c']].merge(df2[['a_top3', 'b_top3', 'c_top3']], left_index=True, right_index=True)

Then use the apply() method with an anonymous function (a lambda) to sum the pairwise products in each row:

result = result.apply(lambda x: x['a'] * x['a_top3'] + x['b'] * x['b_top3'] + x['c'] * x['c_top3'], axis=1)

Now, if you print result, you get:

0    0.168823
1    0.000000
2    0.178181
3    0.000000
4    0.413151
dtype: float64

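Because the Series has dtype float64, its zeros are displayed as 0.000000 and cannot be shown as a plain 0 while the values remain floats.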
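If only the displayed text matters, one option is to format the values as strings. The sketch below is not part of the answer above: it assumes the Series holds the same values as the printed result (rounded to six decimals), and the name as_text is made up for illustration. Note that the formatted Series contains strings, so it is no longer suitable for arithmetic.

```python
import pandas as pd

# Assumed stand-in for `result`, using the values printed above.
result = pd.Series([0.168823, 0.0, 0.178181, 0.0, 0.413151])

# The dtype is float64, which is why pandas pads every value to 6 decimals.
print(result.dtype)

# '{:g}' drops trailing zeros, so 0.0 is rendered as the text '0'.
as_text = result.map("{:g}".format)
print(as_text)
```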

User

Reply #2 · edited 2024-05-17 01:11:31

Let's take a subset of your data (the first three columns of df1 and df2):

In [362]: temp1 = df1.loc[:, ['a','b','c']]
     ...: temp2 = df2.iloc[:, :3]

In [363]: temp1
Out[363]: 
   a  b  c
0  0  1  0
1  0  0  0
2  0  1  0
3  0  0  0
4  1  0  1

In [364]: temp2
Out[364]: 
     a_top3    b_top3    c_top3
0  0.084973  0.168823  0.040331
1  0.057014  0.044830  0.042758
2  0.072326  0.178181  0.077851
3  0.098824  0.032502  0.111248
4  0.252426  0.054870  0.160725

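When you multiply (or apply any similar operation), pandas tries to align both the index and the columns. In this scenario we need a way to align the column names of temp1 (a, b, c) with those of temp2 (a_top3, ...). The simplest solution here is to strip the _top3 suffix from the column names of temp2. The lambda below keeps only the first character of each name, which works because every name starts with a single letter. pandas then multiplies the matching columns and returns what you want: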

In [367]: temp1.mul(temp2.rename(columns = lambda x: x[0])).sum(1)
Out[367]: 
0    0.168823
1    0.000000
2    0.178181
3    0.000000
4    0.413151
dtype: float64
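To see why the rename is needed, here is a small sketch of what alignment does when the labels differ. The two-row frames are toy data invented for illustration, and str.removesuffix assumes Python 3.9 or later; it is a stricter alternative to x[0] that also copes with names longer than one character.

```python
import pandas as pd

# Toy frames (hypothetical data) with the same naming pattern as above.
temp1 = pd.DataFrame({"a": [0, 1], "b": [1, 0]})
temp2 = pd.DataFrame({"a_top3": [0.1, 0.2], "b_top3": [0.3, 0.4]})

# No column label is shared, so pandas takes the union of the labels
# and fills every cell with NaN.
unaligned = temp1.mul(temp2)
print(unaligned)

# After stripping the suffix the labels match and real products appear.
renamed = temp2.rename(columns=lambda name: name.removesuffix("_top3"))
aligned = temp1.mul(renamed)
print(aligned.sum(axis=1))
```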

Extending the same idea to the full df1 and df2:

In [368]: df1.mul(df2.rename(columns = lambda x: x[0])).sum(1)
Out[368]: 
0    0.491092
1    0.597439
2    0.509982
3    0.447959
4    0.727809
dtype: float64
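Since the question asks about columns at the same position, another option is to skip label alignment altogether. This is a sketch rather than the method used above: it assumes both frames have the same shape and that their rows and columns are already in matching order. The small frames and the names weighted and row_totals are invented for illustration.

```python
import pandas as pd

# Toy frames (hypothetical data); columns correspond by position only.
df1 = pd.DataFrame({"a": [0, 1], "b": [1, 0]})
df2 = pd.DataFrame({"a_top3": [0.1, 0.2], "b_top3": [0.3, 0.4]})

# to_numpy() drops the labels, so the product is taken element by element
# according to position instead of by column name.
weighted = df1.mul(df2.to_numpy())

# Row-wise sum of the products, as in the examples above.
row_totals = weighted.sum(axis=1)
print(row_totals)
```

The result keeps the index and column labels of df1. If the column order of the two frames could differ, the rename-based approach is safer because it matches by name.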
