比较两个点元组列表的更快方法？

4 投票

3 回答

2772 浏览

数据工程师

提问于 2025-04-16 19:08

我有两个列表（它们的长度可能相同，也可能不同）。每个列表里都有一系列的元组，每个元组包含两个点（基本上就是X和Y的值）。

我正在比较这两个列表，想找出两个点的值相似的情况。我试过用列表推导的方法，但里面嵌套的元组让我搞得很混乱，最后也没能成功。

这样做是最好的（最快的）方法吗？我觉得可能有更符合Python风格的方法。

假设我有两个列表：

pointPairA = [(2,1), (4,8)]
pointPairB = [(3,2), (10,2), (4,2)]

然后我准备了一个空列表来存储匹配的对，还有一个容差值，用来只存储匹配的对。

matchedPairs = []
tolerance = 2

接着我写了一个循环，解包元组，比较它们的差值，然后把匹配的对添加到matchedPairs列表中，以表示找到匹配。

for pointPairA in pointPairListA:
    for pointPairB in pointPairListB:
        ## Assign the current X,Y values for each pair
        pointPairA_x, pointPairA_y = pointPairA
        pointPairB_x, pointPairB_x = pointPairB

        ## Get the difference of each set of points
        xDiff = abs(pointPairA_x - pointPairB_x)
        yDiff = abs(pointPairA1_y - pointPairB_y)

        if xDiff < tolerance and yDiff < tolerance:
            matchedPairs.append((pointPairA, pointPairB))

这样做的结果是matchedPairs看起来像这样，里面包含了两个点的元组：

[( (2,1), (3,2) ), ( (2,1), (4,2) )]

算法优化空列表列表推导元组比较数据匹配差值计算元组解包容差值

3 个回答

使用列表推导式：

[(pa, pb) for pa in pointPairA for pb in pointPairB \
          if abs(pa[0]-pb[0]) <= tolerance and abs(pa[1]-pb[1]) <= tolerance]

比你用循环的方法稍微快一点：

(for 1 million executions)

>>> (list comprehension).timeit()
2.1963138580322266 s

>>> (your method).timeit()
2.454944133758545 s

回答于 2025-04-16 由 Python大师

分享举报

如果这些列表很大，我建议找一个更快的算法...

我会先把两个包含成对数据的列表按每对中两个数字（x和y）的和进行排序。因为只有当两个点的和接近时，它们才可能相互接近。

对于第一个列表中的任何一个点，这样做会大大缩小你在第二个列表中需要搜索的范围。你可以在第二个列表上保持一个“滑动窗口”，这个窗口对应的元素的和要在当前第一个列表元素的和的2*tolerance范围内。（实际上，你只需要记录滑动窗口的起始位置就可以了...）

假设tolerance的值比较小，这样就可以把你的O(n^2)的操作变成O(n log n)的操作。

回答于 2025-04-16 由 Python大师

分享举报

这里的pointpairA是一个单独的列表，而pointpairB则是20,000个列表中的一个。

from collections import defaultdict
from itertools import product

pointPairA = [(2,1), (4,8)]
pointPairB = [(3,2), (10,2), (4,2)]
tolerance = 2

dA = defaultdict(list)
tolrange = range(-tolerance, tolerance+1)
for pA, dx, dy in product(pointPairA, tolrange, tolrange):
    dA[pA[0]+dx,pA[1]+dy].append(pA)

# you would have a loop here though the 20k lists
matchedPairs = [(pA, pB) for pB in pointPairB for pA in dA[pB]]  

print matchedPairs

回答于 2025-04-16 由 Python大师

分享举报

比较两个点元组列表的更快方法？

3 个回答

撰写回答