Find all close numerical matches in two 2D arrays

>>> arr1 = [ [19.21, 19.19], [13.18, 11.55], [21.45, 5.83] ]
>>> arr2 = [ [13.11, 11.54], [19.20, 19.19], [51.21, 21.55], [19.22, 19.18], [11.21, 11.55] ]
>>> find_close_match_indices(arr1, arr2, tol=0.1)
[[0, 1], [0, 3], [1, 0]]

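3 answers

User

Answer 1 · edited 2024-05-13 04:27:28

You may find the following useful. It is likely faster than @Tim Roberts's solution because there is no explicit for loop, but it uses more memory: broadcasting builds an intermediate array with one entry per pair of rows.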

import numpy as np

xarr1 = np.array([
    [19.21, 19.19],
    [13.18, 11.55],
    [21.45,  5.83]
])
xarr2 = np.array([
    [13.11, 11.54],
    [19.20, 19.19],
    [51.21, 21.55],
    [19.22, 19.18],
    [11.21, 11.55]
])

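tol = 0.1
# Add singleton axes so the subtraction broadcasts to shape (3, 5, 2)
xarr1 = xarr1[:, None, :]
xarr2 = xarr2[None, :, :]
# broadcasting: difference between every row of xarr1 and every row of xarr2
cc = xarr2 - xarr1
# Euclidean distance per pair; passing axis=-1 avoids the hidden Python loop
# that np.apply_along_axis would run
cc = np.linalg.norm(cc, axis=-1)
# or, instead of the norm line above, use another metric of closeness,
# e.g. the largest per-column absolute difference:
# cc = np.max(np.abs(cc), axis=-1)
# id1[k], id2[k] are the row indices in xarr1 and xarr2 of the k-th match
id1, id2 = np.where(cc < tol)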
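
Since the memory cost is the main drawback mentioned above, here is a minimal sketch of one way to bound it: process the first array in chunks so that only a `chunk × m` distance matrix exists at any time. The function name `find_close_match_indices_chunked` and the `chunk` parameter are illustrative assumptions, not part of the original answer, and the sketch reintroduces one explicit loop over chunks.

```python
import numpy as np

def find_close_match_indices_chunked(arr1, arr2, tol=0.1, chunk=1024):
    """Return [i, j] pairs where row i of arr1 is within tol (Euclidean) of row j of arr2.

    `chunk` (an assumed, illustrative parameter) caps how many rows of arr1
    are broadcast against arr2 at once.
    """
    a = np.asarray(arr1, dtype=float)
    b = np.asarray(arr2, dtype=float)
    pairs = []
    for start in range(0, len(a), chunk):
        block = a[start:start + chunk, None, :]                 # shape (c, 1, d)
        dist = np.linalg.norm(b[None, :, :] - block, axis=-1)   # shape (c, m)
        idx = np.argwhere(dist < tol)                           # local (row, col) pairs
        idx[:, 0] += start                                      # shift rows to global indices
        pairs.append(idx)
    return np.concatenate(pairs).tolist() if pairs else []

arr1 = [[19.21, 19.19], [13.18, 11.55], [21.45, 5.83]]
arr2 = [[13.11, 11.54], [19.20, 19.19], [51.21, 21.55], [19.22, 19.18], [11.21, 11.55]]
print(find_close_match_indices_chunked(arr1, arr2, tol=0.1, chunk=2))
# [[0, 1], [0, 3], [1, 0]]
```

A smaller `chunk` lowers peak memory at the cost of more loop iterations.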

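User

Answer 2 · edited 2024-05-13 04:27:28

I had an idea for solving this with buckets. The idea is to form a key for each element from its values and the tolerance level. To make sure that potential matches near the edge of one bucket are compared with elements near the edge of a neighboring bucket, all adjacent buckets are compared as well. Finally, I modified @Tim Roberts's method for performing the actual matching so that it matches on both columns.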
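
To make the bucketing idea concrete, here is a rough sketch under stated assumptions. It is not the library's actual implementation: the name `bucket_find_matches` is hypothetical, the key is assumed to be each coordinate divided by `tol` and floored, and a pair counts as a match when every column differs by less than `tol`.

```python
from collections import defaultdict
from itertools import product

import numpy as np

def bucket_find_matches(arr0, arr1, tol):
    """Hypothetical grid-bucket sketch: match rows whose per-column difference is < tol."""
    a = np.asarray(arr0, dtype=float)
    b = np.asarray(arr1, dtype=float)

    # Assumed key: floor(value / tol) per coordinate, so each bucket is a cell of width tol
    buckets = defaultdict(list)
    for j, key in enumerate(map(tuple, np.floor(b / tol).astype(int))):
        buckets[key].append(j)

    # Two values closer than tol land in the same or a neighboring cell,
    # so checking offsets -1, 0, +1 along every axis covers all candidates
    offsets = list(product((-1, 0, 1), repeat=a.shape[1]))

    matches = []
    for i, row in enumerate(a):
        key = np.floor(row / tol).astype(int)
        for off in offsets:
            for j in buckets.get(tuple(key + off), ()):
                # Exact check on the few candidates found in this cell
                if np.all(np.abs(b[j] - row) < tol):
                    matches.append([i, j])
    return sorted(matches)

arr1 = [[19.21, 19.19], [13.18, 11.55], [21.45, 5.83]]
arr2 = [[13.11, 11.54], [19.20, 19.19], [51.21, 21.55], [19.22, 19.18], [11.21, 11.55]]
print(bucket_find_matches(arr1, arr2, tol=0.1))
# [[0, 1], [0, 3], [1, 0]]
```

Each row is compared only with rows in its own and adjacent cells, so the work grows with the number of nearby candidates rather than with the full product of the two array lengths.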

I turned it into a library called close-numerical-matches. Example usage:

>>> import numpy as np
>>> from close_numerical_matches import find_matches
>>> arr0 = np.array([[25, 24], [50, 50], [25, 26]])
>>> arr1 = np.array([[25, 23], [25, 25], [50.6, 50.6], [60, 60]])
>>> find_matches(arr0, arr1, tol=1.0001)
array([[0, 0], [0, 1], [1, 2], [2, 1]])
>>> find_matches(arr0, arr1, tol=0.9999)
array([[1, 2]])
>>> find_matches(arr0, arr1, tol=0.60001)
array([], dtype=int64)
>>> find_matches(arr0, arr1, tol=0.60001, dist='max')
array([[1, 2]])
>>> manhatten_dist = lambda arr: np.sum(np.abs(arr), axis=1)
>>> matches = find_matches(arr0, arr1, tol=0.11, dist=manhatten_dist)
>>> matches
array([[0, 1], [0, 1], [2, 1]])
>>> indices0, indices1 = matches.T
>>> arr0[indices0]
array([[25, 24], [25, 24], [25, 26]])

Some profiling:

from timeit import default_timer as timer
import numpy as np
from close_numerical_matches import naive_find_matches, find_matches

arr0 = np.random.rand(320_000, 2)
arr1 = np.random.rand(44_000, 2)

start = timer()
naive_find_matches(arr0, arr1, tol=0.001)
end = timer()
print(end - start)  # 255.335 s

start = timer()
find_matches(arr0, arr1, tol=0.001)
end = timer()
print(end - start)  # 5.821 s

User

Answer 3 · edited 2024-05-13 04:27:28

You can't do this without a loop, but you can do it with a single loop by taking advantage of boolean indexing:

import numpy as np

xarr1 = np.array([
    [19.21, 19.19],
    [13.18, 11.55],
    [21.45,  5.83]
])
xarr2 = np.array([
    [13.11, 11.54],
    [19.20, 19.19],
    [51.21, 21.55],
    [19.22, 19.18],
    [11.21, 11.55]
])

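def find_close_match_indices(arr1, arr2, tol=0.1):
    results = []
    # Note: only the first column is compared here. To require every column
    # to match, loop over full rows and use
    # np.all(np.abs(arr2 - r1) < tol, axis=1) as the mask instead.
    for i, r1 in enumerate(arr1[:, 0]):
        # Boolean mask over the rows of arr2 whose first column is within tol of r1
        x1 = np.abs(arr2[:, 0] - r1) < tol
        # int(k) keeps plain Python ints so the printed list stays readable
        results.extend([i, int(k)] for k in np.where(x1)[0])
    return results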

print(find_close_match_indices(xarr1,xarr2,0.1))

Output:

[[0, 1], [0, 3], [1, 0]]
