Computing the pairwise distances between all sub-arrays (rows) of a NumPy array

Posted 2024-04-26 04:33:10


I have an array:

import numpy as np

test_arr = np.array([ [1.2, 2.1, 2.3, 4.5],
                      [2.6, 6.4, 5.2, 6.2],
                      [7.2, 6.2, 2.5, 1.7],
                      [8.2, 7.6, 4.2, 7.3] ])

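Is it possible to get a DataFrame in the following format:

row_id  | row1  | row2          | row3          | row4
row1      0.0     d(row1,row2)    d(row1,row3)    d(row1,row4)
row2      ...     0.0             ...             ...
row3      ...     ...             0.0             ...
row4      ...     ...             ...             0.0

where d(row1, row2) is the Euclidean distance between row1 and row2.

What I do at the moment is first generate a list of all row pairs, then compute the distance for each pair and assign each value to its cell in the DataFrame. Is there a better/faster way?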
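For reference, the pair-by-pair approach described above might look roughly like the sketch below. The helper name `pairwise_loop` is made up for illustration and is not from the original post; it fills a plain NumPy matrix, which could then be passed to `pd.DataFrame`.

```python
from itertools import combinations

import numpy as np

test_arr = np.array([[1.2, 2.1, 2.3, 4.5],
                     [2.6, 6.4, 5.2, 6.2],
                     [7.2, 6.2, 2.5, 1.7],
                     [8.2, 7.6, 4.2, 7.3]])


def pairwise_loop(arr):
    """Hypothetical helper: Euclidean distance for every pair of rows, one pair at a time."""
    n = arr.shape[0]
    dist = np.zeros((n, n))  # the diagonal stays 0.0
    # combinations yields each unordered pair (i, j) with i < j exactly once
    for i, j in combinations(range(n), 2):
        d = np.linalg.norm(arr[i] - arr[j])
        dist[i, j] = d
        dist[j, i] = d  # the distance is symmetric
    return dist


loop_dist = pairwise_loop(test_arr)
print(loop_dist)
```

This works, but the Python-level loop runs once per pair, so it slows down quickly as the number of rows grows. The answers below replace the loop with a single library call.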


3 answers

import pandas as pd
from sklearn.metrics.pairwise import euclidean_distances

pd.DataFrame(euclidean_distances(test_arr, test_arr))

          0         1         2         3
0  0.000000  5.634714  7.790379  9.523655
1  5.634714  0.000000  6.981404  5.916925
2  7.790379  6.981404  0.000000  6.100000
3  9.523655  5.916925  6.100000  0.000000

Use cdist to compute the pairwise distances.

Then place the resulting 2-D array into a DataFrame.

import numpy as np
from scipy.spatial.distance import cdist
import pandas as pd

test_arr = np.array([ [1.2, 2.1, 2.3, 4.5],
                      [2.6, 6.4, 5.2, 6.2],
                      [7.2, 6.2, 2.5, 1.7],
                      [8.2, 7.6, 4.2, 7.3] ])

# Use cdist to compute pairwise distances (Euclidean by default)
dist = cdist(test_arr, test_arr)

# Place into a pandas DataFrame,
# using the same labels for the index and the columns
names = ['row' + str(i) for i in range(1, dist.shape[0]+1)]
df = pd.DataFrame(dist, columns = names, index = names)

print(df)

Output

Pandas DataFrame

        row1      row2      row3      row4
row1  0.000000  5.634714  7.790379  9.523655
row2  5.634714  0.000000  6.981404  5.916925
row3  7.790379  6.981404  0.000000  6.100000
row4  9.523655  5.916925  6.100000  0.000000
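
If SciPy is not available, the same matrix can be computed with NumPy broadcasting alone. This is a minimal sketch and not part of the answer above; the names `diff` and `bcast_dist` are illustrative.

```python
import numpy as np

test_arr = np.array([[1.2, 2.1, 2.3, 4.5],
                     [2.6, 6.4, 5.2, 6.2],
                     [7.2, 6.2, 2.5, 1.7],
                     [8.2, 7.6, 4.2, 7.3]])

# diff[i, j, :] holds test_arr[i] - test_arr[j]; its shape is (4, 4, 4)
diff = test_arr[:, None, :] - test_arr[None, :, :]

# Sum the squared differences over the last axis, then take the square root
bcast_dist = np.sqrt((diff ** 2).sum(axis=-1))
print(bcast_dist)
```

The intermediate `diff` array has one entry per pair of rows per column, so memory use grows with the square of the number of rows. For large inputs a dedicated routine such as cdist is likely the safer choice.
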
Alternatively, use pdist together with squareform:

from scipy import spatial
import numpy as np

test_arr = np.array([ [1.2, 2.1, 2.3, 4.5],
                      [2.6, 6.4, 5.2, 6.2],
                      [7.2, 6.2, 2.5, 1.7],
                      [8.2, 7.6, 4.2, 7.3] ])

dist = spatial.distance.pdist(test_arr)
spatial.distance.squareform(dist)

Result:

array([[0.        , 5.63471383, 7.79037868, 9.52365476],
       [5.63471383, 0.        , 6.98140387, 5.91692488],
       [7.79037868, 6.98140387, 0.        , 6.1       ],
       [9.52365476, 5.91692488, 6.1       , 0.        ]])
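
The result above is a plain NumPy array, whereas the question asks for a labeled DataFrame. The sketch below shows one way to bridge that gap; the `row1`..`row4` labels follow the question, and the variable names `condensed` and `square` are illustrative.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

test_arr = np.array([[1.2, 2.1, 2.3, 4.5],
                     [2.6, 6.4, 5.2, 6.2],
                     [7.2, 6.2, 2.5, 1.7],
                     [8.2, 7.6, 4.2, 7.3]])

# pdist returns a condensed 1-D array with n*(n-1)/2 entries (6 for 4 rows),
# one per unordered pair of rows; the default metric is Euclidean
condensed = pdist(test_arr)

# squareform expands the condensed form into the symmetric n x n matrix
square = squareform(condensed)

# Label rows and columns as in the question
names = [f"row{i}" for i in range(1, square.shape[0] + 1)]
df = pd.DataFrame(square, index=names, columns=names)
print(df)
```

Because pdist computes each pair only once, it does about half the distance work of calling cdist on the array against itself.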
