Distance matrix between geographic coordinates

Published 2024-06-17 08:01:57


I have a pandas DataFrame with more than 600 geographic coordinate points. Here is an excerpt:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from math import sin, cos, sqrt, atan2, radians

lat_long = pd.DataFrame({'LATITUDE':[-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})
lat_long

To compute the distance between two points by hand, I use the following code:

lat1 = radians(lat_long['LATITUDE'][0])
lon1 = radians(lat_long['LONGITUDE'][0])
lat2 = radians(lat_long['LATITUDE'][1])
lon2 = radians(lat_long['LONGITUDE'][1])

R = 6373.0

dlon = lon2 - lon1
dlat = lat2 - lat1

a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
c = 2 * atan2(sqrt(a), sqrt(1 - a))

distance = R * c

print("Result:", round(distance,4))

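What I need is a function that uses the formula above to compute the distance from every point to every other point, as in an array. But I am struggling to work out what the function should look like and how to store the distances between the points. Any help is welcome. Example output (for illustration only, in case I was not clear):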

|        | point0 | point1 | point2 |
| point0 |    0   |    2   |    3   |
| point1 |    2   |    0   |    4   |
| point2 |    3   |    4   |    0   |
         |distance|distance|distance|

2 answers

Another possible solution is:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from math import sin, cos, sqrt, atan2, radians

lat_long = pd.DataFrame({'LATITUDE':[-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})
lat_long

def distance(city1, city2):
    lat1 = radians(city1['LATITUDE'])
    lon1 = radians(city1['LONGITUDE'])
    lat2 = radians(city2['LATITUDE'])
    lon2 = radians(city2['LONGITUDE'])

    R = 6373.0

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    distance = R * c

    return distance

dist = np.zeros([lat_long.shape[0],lat_long.shape[0]])
for i1, city1 in lat_long.iterrows():
    for i2, city2 in lat_long.iloc[i1+1:,:].iterrows():
        dist[i1,i2] = distance(city1, city2)

print(dist)

Output:

[[ 0.         20.51149047  8.41230771 15.32026132 50.17836849]
 [ 0.          0.         16.33997119 15.83407186 30.03192954]
 [ 0.          0.          0.          6.90864606 44.18376436]
 [ 0.          0.          0.          0.         40.02842872]
 [ 0.          0.          0.          0.          0.        ]]

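The lower triangle of the distance matrix is left as zeros because the matrix is symmetric (dist[i1,i2] == dist[i2,i1]), so only the upper triangle is computed.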
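If the full symmetric matrix is needed rather than only the upper triangle, one option is to add the matrix to its own transpose. The sketch below is an illustration, not part of the original answer: it uses the first 3x3 block of the output printed above, and the names `upper` and `full` are made up for the example. It relies on the diagonal being zero, which holds for a distance matrix.

```python
import numpy as np

# Upper-triangular distances (km), copied from the first three
# rows and columns of the output above.
upper = np.array([
    [0.0, 20.51149047,  8.41230771],
    [0.0,  0.0,        16.33997119],
    [0.0,  0.0,         0.0],
])

# The diagonal is zero, so adding the transpose mirrors the upper
# triangle into the lower one without double-counting anything.
full = upper + upper.T

print(full)
```

The same line, `dist + dist.T`, would apply to the `dist` array built by the loop above.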

You can use pdist to compute the pairwise distances:

import pandas as pd

import numpy as np
from math import sin, cos, sqrt, atan2, radians

from scipy.spatial.distance import pdist, squareform

lat_long = pd.DataFrame({'LATITUDE': [-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})


def dist(x, y):
    """Function to compute the distance between two points x, y"""

    lat1 = radians(x[0])
    lon1 = radians(x[1])
    lat2 = radians(y[0])
    lon2 = radians(y[1])

    R = 6373.0

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    distance = R * c

    return round(distance, 4)


distances = pdist(lat_long.values, metric=dist)

points = [f'point_{i}' for i in range(1, len(lat_long) + 1)]

result = pd.DataFrame(squareform(distances), columns=points, index=points)

print(result)

Output:

         point_1  point_2  point_3  point_4  point_5
point_1   0.0000  20.5115   8.4123  15.3203  50.1784
point_2  20.5115   0.0000  16.3400  15.8341  30.0319
point_3   8.4123  16.3400   0.0000   6.9086  44.1838
point_4  15.3203  15.8341   6.9086   0.0000  40.0284
point_5  50.1784  30.0319  44.1838  40.0284   0.0000

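Note that pdist returns a condensed vector holding only the upper-triangle distances; squareform expands it into a full square matrix, so the result is stored in a NumPy array.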
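With 600 or more points, calling a Python function once per pair can be slow. As an alternative sketch (not from either answer), the same haversine formula can be evaluated for all pairs at once with NumPy broadcasting. The function name `haversine_matrix` and its `radius_km` parameter are hypothetical, and the radius of 6373.0 km is simply carried over from the question.

```python
import numpy as np


def haversine_matrix(lat_deg, lon_deg, radius_km=6373.0):
    """Return the full pairwise haversine distance matrix in km.

    Hypothetical helper for illustration; radius_km defaults to the
    value used in the question.
    """
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))

    # Column vector minus row vector gives every pairwise difference.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]

    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Guard against tiny floating-point excursions outside [0, 1].
    a = np.clip(a, 0.0, 1.0)

    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return radius_km * c


# Same five points as in the question.
lats = [-22.98, -22.97, -22.92, -22.87, -22.89]
lons = [-43.19, -43.39, -43.24, -43.28, -43.67]

D = haversine_matrix(lats, lons)
print(np.round(D, 4))
```

For the DataFrame in the question, the inputs would be `lat_long['LATITUDE'].to_numpy()` and `lat_long['LONGITUDE'].to_numpy()`. The result is an n-by-n array, so memory grows quadratically with the number of points.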
