Python/Pandas: replace elements in one DataFrame with values from another DataFrame

   userId  itemId  interaction
1       1       1            1
2       1       2            0
3       1       3            1
4       1       4            1
5       2       9            1
6       3       3            1
7       3       5            0

   userId  itemId  interaction
1       1       1          NaN
2       1       2          NaN
3       1       3          NaN
4       1       4          NaN
5       2       9            1
6       3       3            1
7       3       5            0

   userId  itemId  interaction
1       1       1          NaN
2       1       2          NaN
3       1       3            1
4       1       4            1
5       2       9            1
6       3       3            1
7       3       5            0

for item in ranked_items:
    if new_user_item_matrix.loc[new_user_item_matrix['userId']==cold_user].loc[new_user_item_matrix['itemId']==item].empty:
        pass
    else:
        new_user_item_matrix.replace(
            to_replace=new_user_item_matrix.loc[new_user_item_matrix['userId']==1].loc[new_user_item_matrix['itemId']==item].iloc[0,2],
            value=cold_user_item_matrix.loc[cold_user_item_matrix['itemId']==item].iloc[0,2],
            inplace=True)
new_user_item_matrix.dropna(axis=0,how='any',inplace=True)

   userId  itemId  interaction
1       1       1            1
2       1       2            1
3       1       3            1
4       1       4            1
5       2       9            1
6       3       3            1
7       3       5            0
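The loop above relies on DataFrame.replace, which matches cells by value across the whole frame rather than by position. The following is a minimal sketch of that difference, using a made-up three-row frame (not the data from the question) and the illustrative names `replaced`, `targeted`, and `mask`:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen only to illustrate the behaviour
df = pd.DataFrame({
    "userId": [1, 1, 2],
    "itemId": [1, 3, 9],
    "interaction": [np.nan, np.nan, 1.0],
})

# replace() is value-based: every NaN in the frame becomes 1.0
replaced = df.replace(to_replace=np.nan, value=1.0)

# .loc with a boolean mask is position-based: only the selected row changes
targeted = df.copy()
mask = (targeted["userId"] == 1) & (targeted["itemId"] == 3)
targeted.loc[mask, "interaction"] = 1.0

print(replaced)
print(targeted)
```

In `replaced` no NaN is left, while in `targeted` the row for item 1 is still NaN and can be removed later with dropna.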

1 Answer

Answer #1 · Posted 2024-05-13 22:46:39

Short solution

Essentially, you want to drop a given user's item interactions, but only for those items that are not ranked.

To keep the proposed solutions readable, assume that df = initial_user_item_matrix.

Simple row selection with a boolean condition (returns a new, filtered DataFrame and leaves the original df unchanged):

filtered_df = df[(df.userID != 1) | df.itemID.isin(ranked_items)]
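As a self-contained sketch of the selection above, the snippet below rebuilds the sample data used later in this answer and applies the same condition. The only changes are illustrative: `cold_user` stands in for the literal 1, and the mask is stored in a variable named `keep`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 2, 0], [1, 3, 1], [1, 4, 1], [2, 9, 1], [3, 3, 1], [3, 5, 0]],
    columns=["userID", "itemID", "interaction"],
)
ranked_items = np.array([9, 5, 3, 4])
cold_user = 1

# Keep a row if it belongs to another user, or if its item is ranked
keep = (df.userID != cold_user) | df.itemID.isin(ranked_items)
filtered_df = df[keep]

print(filtered_df)
```

The original df still holds all seven rows afterwards; only filtered_df lacks the cold user's unranked items (items 1 and 2).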

A similar solution that modifies the DataFrame itself by dropping the "invalid" rows:

[code snippet not preserved in the source]
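A minimal sketch of how such an in-place variant could look, assuming the same sample data and using DataFrame.drop with inplace=True. This is a reconstruction for illustration, not the answer's original snippet, and the names `invalid` and `cold_user` are introduced here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 2, 0], [1, 3, 1], [1, 4, 1], [2, 9, 1], [3, 3, 1], [3, 5, 0]],
    columns=["userID", "itemID", "interaction"],
)
ranked_items = np.array([9, 5, 3, 4])
cold_user = 1

# "Invalid" rows: they belong to the cold user AND their item is not ranked
invalid = (df.userID == cold_user) & ~df.itemID.isin(ranked_items)

# Drop those rows by index label, modifying df itself
df.drop(index=df[invalid].index, inplace=True)

print(df)
```

The condition is the logical negation of the one used for the row selection, so the rows that remain are the same.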

Step-by-step solution using all intermediate structures

Assuming that all of the intermediate structures mentioned above are required, the desired result can be obtained as follows:

import pandas as pd
import numpy as np

initial_user_item_matrix = pd.DataFrame([[1, 1, 1], 
                                        [1, 2, 0], 
                                        [1, 3, 1], 
                                        [1, 4, 1], 
                                        [2, 9, 1], 
                                        [3, 3, 1], 
                                        [3, 5, 0]],
                                        columns=['userID', 'itemID', 'interaction'])
print("initial_user_item_matrix\n{}\n".format(initial_user_item_matrix))

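# items that are ranked; the cold user's interactions with other items get dropped
ranked_items = np.array([9, 5, 3, 4])

cold_user = 1

# all rows that belong to the cold user
cold_user_item_matrix = initial_user_item_matrix.loc[initial_user_item_matrix.userID == cold_user]
print("cold_user_item_matrix\n{}\n".format(cold_user_item_matrix))

# work on a copy; 'interaction' is cast to float so that the column can hold NaN
new_user_item_matrix = initial_user_item_matrix.astype({'interaction': 'float64'})
# blank out every interaction of the cold user
new_user_item_matrix.loc[new_user_item_matrix.userID == cold_user, 'interaction'] = np.nan
print("new_user_item_matrix\n{}\n".format(new_user_item_matrix))

# restore the interaction only for ranked items (the assignment aligns on the row index)
new_user_item_matrix.loc[new_user_item_matrix.userID == cold_user, 'interaction'] = cold_user_item_matrix.apply(lambda r: r.interaction if r.itemID in ranked_items else np.nan, axis=1)
print("new_user_item_matrix after replacing\n{}\n".format(new_user_item_matrix))

# remove the rows whose interaction is still NaN
new_user_item_matrix.dropna(inplace=True)
print("new_user_item_matrix after dropping nans\n{}\n".format(new_user_item_matrix))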

which produces:

initial_user_item_matrix
   userID  itemID  interaction
0       1       1            1
1       1       2            0
2       1       3            1
3       1       4            1
4       2       9            1
5       3       3            1
6       3       5            0

cold_user_item_matrix
   userID  itemID  interaction
0       1       1            1
1       1       2            0
2       1       3            1
3       1       4            1

new_user_item_matrix
   userID  itemID  interaction
0       1       1          NaN
1       1       2          NaN
2       1       3          NaN
3       1       4          NaN
4       2       9            1
5       3       3            1
6       3       5            0

new_user_item_matrix after replacing
   userID  itemID  interaction
0       1       1          NaN
1       1       2          NaN
2       1       3            1
3       1       4            1
4       2       9            1
5       3       3            1
6       3       5            0

new_user_item_matrix after dropping nans
   userID  itemID  interaction
2       1       3            1
3       1       4            1
4       2       9            1
5       3       3            1
6       3       5            0
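The same intermediate steps can also be written without a row-wise apply. The sketch below is one possible alternative, not part of the original answer. It uses Series.where with boolean masks, and the names `is_cold` and `is_ranked` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

initial_user_item_matrix = pd.DataFrame(
    [[1, 1, 1], [1, 2, 0], [1, 3, 1], [1, 4, 1], [2, 9, 1], [3, 3, 1], [3, 5, 0]],
    columns=["userID", "itemID", "interaction"],
)
ranked_items = np.array([9, 5, 3, 4])
cold_user = 1

# Copy with a float 'interaction' column so that it can hold NaN
new_user_item_matrix = initial_user_item_matrix.astype({"interaction": "float64"})

is_cold = new_user_item_matrix.userID == cold_user
is_ranked = new_user_item_matrix.itemID.isin(ranked_items)

# where() keeps a value when the condition is True and writes NaN otherwise,
# so only the cold user's unranked items are blanked out
new_user_item_matrix["interaction"] = new_user_item_matrix["interaction"].where(~is_cold | is_ranked)

# Remove the blanked rows
new_user_item_matrix = new_user_item_matrix.dropna()

print(new_user_item_matrix)
```

This skips the intermediate all-NaN state for the cold user but ends with the same rows (index 2 to 6) as the step-by-step version.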
