Computing Euclidean distances with NumPy in Python

3 answers

User

Answer 1 · edited 2024-06-16 09:45:18

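I think this is what you were trying to do: you said you wanted a 20 by 20 matrix, but the one you coded is triangular.

So I coded a complete 20 x 20 matrix instead.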

import math

distances = []
for i in range(len(ncoord)):
    given_i = []
    for j in range(len(ncoord)):
        d_val = math.sqrt((ncoord[i, 0] - ncoord[j, 0])**2 + (ncoord[i, 1] - ncoord[j, 1])**2)
        given_i.append(d_val)

    distances.append(given_i)

# distances[i][j] = distance from i to j

A shortcut:

from scipy.spatial.distance import cdist
# Isn't scipy nice? pdist also works, but it returns the distances in condensed (1-D) form.
distances = cdist(ncoord, ncoord, 'euclidean')
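
As a quick sanity check, the loop and cdist can be compared on made-up data. This is only a minimal sketch: the random 20 x 2 ncoord below is an assumption standing in for the asker's coordinates, and the names loop_distances and cdist_distances are illustrative.

```python
import math

import numpy as np
from scipy.spatial.distance import cdist

# hypothetical stand-in for the asker's data: 20 random 2-D points
rng = np.random.default_rng(0)
ncoord = rng.random((20, 2))

# explicit double loop, same arithmetic as the first snippet
loop_distances = [
    [math.sqrt((ncoord[i, 0] - ncoord[j, 0]) ** 2
               + (ncoord[i, 1] - ncoord[j, 1]) ** 2)
     for j in range(len(ncoord))]
    for i in range(len(ncoord))
]

# library version
cdist_distances = cdist(ncoord, ncoord, "euclidean")

print(cdist_distances.shape)
print(np.allclose(loop_distances, cdist_distances))
```

The two results should agree up to floating-point rounding, which is why the comparison uses np.allclose rather than ==.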

User

Answer 2 · edited 2024-06-16 09:45:18

import math

# assumes n = len(ncoord) and c is a preallocated (n, n) array
for i in range(n):
    for j in range(i + 1, n):
        c[i, j] = math.sqrt((ncoord[i, 0] - ncoord[j, 0])**2
                            + (ncoord[i, 1] - ncoord[j, 1])**2)

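Note: ncoord[i, j] is not the same as ncoord[i][j] for a NumPy matrix. This appears to be the source of the confusion. If ncoord is a NumPy array, the two give the same result.

For a NumPy matrix, ncoord[i] returns the i-th row of ncoord, which is itself a NumPy matrix object, of shape 1 x 2 in your case. So ncoord[i][j] actually means: take the i-th row of ncoord, then take the j-th row of that 1 x 2 matrix. This is where the indexing problem shows up when j > 0.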
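
The difference is easy to reproduce on a tiny example. The sketch below uses made-up 3 x 2 coordinates (an assumption, not the asker's data); as_matrix and as_array are illustrative names.

```python
import numpy as np

# hypothetical 3 x 2 coordinates, just for illustration
as_matrix = np.matrix([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
as_array = np.asarray(as_matrix)

# a matrix row is still 2-D, an array row is 1-D
print(as_matrix[1].shape)   # (1, 2)
print(as_array[1].shape)    # (2,)

# for an array both spellings pick the same element
print(as_array[1, 1] == as_array[1][1])   # True

# for a matrix, [1][1] asks for row 1 of a 1 x 2 matrix, which does not exist
try:
    as_matrix[1][1]
    chained_indexing_failed = False
except IndexError:
    chained_indexing_failed = True
print(chained_indexing_failed)   # True
```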

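Regarding your comment that assigning to c[i][j] "works": it shouldn't. At least with my NumPy version, 1.9.1, it should not work if your indices i and j iterate up to n.

As an aside, remember to add the transpose of the matrix c to itself.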
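
Adding the transpose is what turns the upper-triangular result of the loop into a full symmetric matrix. A minimal sketch, assuming c is a plain NumPy array whose diagonal and lower triangle are still zero (the 3 x 3 values are made up):

```python
import numpy as np

# hypothetical strictly upper-triangular distances for 3 points
c = np.array([[0.0, 1.0, 2.0],
              [0.0, 0.0, 1.5],
              [0.0, 0.0, 0.0]])

# the diagonal is zero, so adding the transpose fills the lower triangle
# without double-counting anything
full = c + c.T
print(full)
```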

It is recommended to use NumPy arrays rather than matrices. See this post.

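If the coordinates are stored as a NumPy array, the pairwise distances can be computed as:

from scipy.spatial.distance import pdist

pairwise_distances = pdist(ncoord, metric="euclidean")

or simply

pairwise_distances = pdist(ncoord)

because the default metric is "euclidean", which is the Minkowski metric with p = 2.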

In the comments below, I mistakenly said that the result of pdist is an n x n matrix. To get an n x n matrix, you need to do the following:

from scipy.spatial.distance import pdist, squareform

pairwise_distances = squareform(pdist(ncoord))

or

from scipy.spatial.distance import cdist

pairwise_distances = cdist(ncoord, ncoord)
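
To make the relationship between the condensed and square forms concrete, here is a small sketch on made-up data. The random ncoord and the helper condensed_index are assumptions for illustration; the index formula is the one given in the SciPy pdist documentation.

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform

# hypothetical stand-in for the asker's data: 20 random 2-D points
rng = np.random.default_rng(0)
ncoord = rng.random((20, 2))
n = len(ncoord)

condensed = pdist(ncoord)        # 1-D, one entry per pair i < j
square = squareform(condensed)   # full symmetric (n, n) matrix


def condensed_index(i, j, n):
    """Position of pair (i, j), with i < j, in pdist's condensed output."""
    return n * i + j - ((i + 2) * (i + 1)) // 2


i, j = 3, 7
print(condensed.shape)
print(np.isclose(condensed[condensed_index(i, j, n)], square[i, j]))
print(np.allclose(square, cdist(ncoord, ncoord)))
```

The condensed form stores n * (n - 1) / 2 values, so it is the cheaper choice when the full matrix is not actually needed.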

User
Answer 3 · edited 2024-06-16 09:45:18

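There is an alternative to nested for loops that is much faster. I'll show you two different approaches: the first is the more general one and will introduce you to broadcasting and vectorization, and the second uses a more convenient SciPy library function.

1. The general way, using broadcasting and vectorization

I'd suggest first switching to np.array rather than np.matrix. Arrays are preferred for a number of reasons, most importantly because they can have more than 2 dimensions, and because they make element-wise multiplication much less awkward.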

import numpy as np

ncoord = np.array(ncoord)

With an array, we can eliminate the nested for loops by inserting a new singleton dimension and broadcasting the subtraction over it:

# indexing with None (or np.newaxis) inserts a new dimension of size 1
print(ncoord[:, :, None].shape)
# (20, 2, 1)

# by making the 'inner' dimensions equal to 1, i.e. (20, 2, 1) - (1, 2, 20),
# the subtraction is 'broadcast' over every pair of rows in ncoord
xydiff = ncoord[:, :, None] - ncoord[:, :, None].T

print(xydiff.shape)
# (20, 2, 20)

This is equivalent to looping over every pair of rows with nested for loops, but much, much faster!

xydiff2 = np.zeros((20, 2, 20), dtype=xydiff.dtype)
for ii in range(20):
    for jj in range(20):
        for kk in range(2):
            xydiff2[ii, kk, jj] = ncoord[ii, kk] - ncoord[jj, kk]

# check that these give the same result
print(np.all(xydiff == xydiff2))
# True
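
The speed claim can be checked with timeit. This is a rough sketch rather than a careful benchmark: the random ncoord, the function names diff_loops and diff_broadcast, and the repeat count of 200 are all assumptions, and the actual timings depend on the machine.

```python
import timeit

import numpy as np

# hypothetical stand-in for the asker's data: 20 random 2-D points
rng = np.random.default_rng(0)
ncoord = rng.random((20, 2))


def diff_loops(points):
    """Pairwise coordinate differences with explicit Python loops."""
    n, k = points.shape
    out = np.zeros((n, k, n), dtype=points.dtype)
    for ii in range(n):
        for jj in range(n):
            for kk in range(k):
                out[ii, kk, jj] = points[ii, kk] - points[jj, kk]
    return out


def diff_broadcast(points):
    """Same result via broadcasting: (n, k, 1) - (1, k, n) -> (n, k, n)."""
    return points[:, :, None] - points[:, :, None].T


loop_time = timeit.timeit(lambda: diff_loops(ncoord), number=200)
broadcast_time = timeit.timeit(lambda: diff_broadcast(ncoord), number=200)
print(f"loops: {loop_time:.4f} s, broadcast: {broadcast_time:.4f} s")
```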

We can use vectorized operations for the rest as well:

# we square the differences and sum over the 'middle' axis, equivalent to
# computing (x_i - x_j) ** 2 + (y_i - y_j) ** 2
ssdiff = (xydiff * xydiff).sum(1)

# finally we take the square root
D = np.sqrt(ssdiff)

The whole thing can be done in one line like this:

D = np.sqrt(((ncoord[:, :, None] - ncoord[:, :, None].T) ** 2).sum(1))
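
The same broadcast can be written with the new axis in a different position, which some readers find easier to follow. The sketch below is an alternative layout, not the answer's own code; the random ncoord and the name D_alt are assumptions.

```python
import numpy as np

# hypothetical stand-in for the asker's data: 20 random 2-D points
rng = np.random.default_rng(0)
ncoord = rng.random((20, 2))

# the one-liner from the answer
D = np.sqrt(((ncoord[:, :, None] - ncoord[:, :, None].T) ** 2).sum(1))

# alternative layout: (20, 1, 2) - (1, 20, 2) -> (20, 20, 2),
# then take the norm over the last (coordinate) axis
D_alt = np.linalg.norm(ncoord[:, None, :] - ncoord[None, :, :], axis=-1)

print(D_alt.shape)
print(np.allclose(D, D_alt))
```

Keeping the coordinate axis last means the reduction is always over axis=-1, regardless of how many dimensions the points have.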

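2. The lazy way, using `pdist`

It turns out there is already a fast and convenient function for computing all pairwise distances: scipy.spatial.distance.pdist.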

from scipy.spatial.distance import pdist, squareform

d = pdist(ncoord)

# pdist just returns the upper triangle of the pairwise distance matrix. to get
# the whole (20, 20) array we can use squareform:

print(d.shape)
# (190,)

D2 = squareform(d)
print(D2.shape)
# (20, 20)

# check that the two methods are equivalent
print(np.allclose(D, D2))
# True
