Computing the cosine similarity between two matrices

Posted 2024-04-25 07:16:52


I have some code that computes the cosine similarity between two matrices:

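import numpy as np
import scipy.spatial as sp


def cos_cdist_1(matrix, vector):
    # Cosine distance from every row of matrix to a single vector.
    v = vector.reshape(1, -1)
    return sp.distance.cdist(matrix, v, 'cosine').reshape(-1)


def cos_cdist_2(matrix1, matrix2):
    # Cosine distance between every pair of rows; reshape(-1) flattens the 2-D result.
    return sp.distance.cdist(matrix1, matrix2, 'cosine').reshape(-1)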

list1 = [[1,1,1],[1,2,1]]
list2 = [[1,1,1],[1,2,1]]

matrix1 = np.asarray(list1)
matrix2 = np.asarray(list2)

results = []
for vector in matrix2:
    distance = cos_cdist_1(matrix1,vector)
    distance = np.asarray(distance)
    similarity = (1-distance).tolist()
    results.append(similarity)


dist_all = cos_cdist_2(matrix1, matrix2)
results2 = []
for item in dist_all:
    distance_result = np.asarray(item)
    similarity_result = (1-distance_result).tolist()
    results2.append(similarity_result)

results

[[1.0000000000000002, 0.9428090415820635],
                     [0.9428090415820635, 1.0000000000000002]]

However, results2 is [1.0000000000000002, 0.9428090415820635, 0.9428090415820635, 1.0000000000000002]

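The output I want is the one in results, i.e. a list of lists of similarity values, but I would like to keep the computation matrix-to-matrix rather than vector-to-matrix. Any good ideas?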


1 answer
Answer 1 · posted 2024-04-25 07:16:52
In [75]: import scipy.spatial as sp
In [76]: 1 - sp.distance.cdist(matrix1, matrix2, 'cosine')
Out[76]: 
array([[ 1.        ,  0.94280904],
       [ 0.94280904,  1.        ]])

So you can drop the for-loops entirely and replace them with

results2 = 1 - sp.distance.cdist(matrix1, matrix2, 'cosine')
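
For completeness, here is a minimal self-contained sketch of the same idea, using the sample data from the question. The names similarity, unit1, unit2 and manual are introduced here only for illustration; the cross-check against a normalised dot product is an addition and not part of the original answer.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Sample data from the question.
matrix1 = np.asarray([[1, 1, 1], [1, 2, 1]], dtype=float)
matrix2 = np.asarray([[1, 1, 1], [1, 2, 1]], dtype=float)

# cdist returns an array of shape (len(matrix1), len(matrix2)) holding
# cosine distances; leaving out reshape(-1) keeps that 2-D shape.
similarity = 1 - cdist(matrix1, matrix2, 'cosine')

# Cross-check (illustrative): cosine similarity equals the dot product
# of the L2-normalised rows.
unit1 = matrix1 / np.linalg.norm(matrix1, axis=1, keepdims=True)
unit2 = matrix2 / np.linalg.norm(matrix2, axis=1, keepdims=True)
manual = unit1 @ unit2.T

# Nested Python list, one inner list per row of matrix1.
results2 = similarity.tolist()
```

Note that similarity[i][j] compares row i of matrix1 with row j of matrix2, whereas the loop in the question builds one inner list per row of matrix2. The two agree for this symmetric example; in general, similarity.T.tolist() should reproduce the loop's ordering.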
