如何有效率地比较二维矩阵中的每一对行？

2条回答

网友
1楼 · 编辑于 2024-04-26 20:24:16

由于矩阵的大小和问题的要求，我认为迭代是不可避免的。你不能利用广播，因为它会爆炸你的内存，所以你需要对现有的数组逐行操作。但是，您可以使用numba和njit来大大加快速度，而不是使用纯python方法。你知道吗
import numpy as np from numba import njit @njit def zero_out_contained_rows(a): """ Finds rows where all of the elements are equal or smaller than all corresponding elements of anothe row, and sets all values in the row to zero Parameters a: ndarray The array to modify Returns - The modified array Examples >>> zero_out_contained_rows(np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0]])) array([[1, 0, 1], [0, 1, 0], [0, 0, 0]]) """ x, y = a.shape contained = np.zeros(x, dtype=np.bool_) for i in range(x): for j in range(x): if i != j and not contained[j]: equal = True for k in range(y): if a[i, k] < a[j, k]: equal = False break contained[j] = equal a[contained] = 0 return a
这会记录一行是否在另一行中使用。这可以在最后用0清除其他行中包含的行之前，通过短路防止许多不必要的比较。你知道吗
与使用迭代的初始尝试相比，这是一种速度改进，而且还可以处理将正确的行归零的问题。你知道吗
a = np.random.randint(0, 2, (6000, 6000)) %timeit zero_out_contained_rows(a) 1.19 s ± 1.87 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
我将更新计时一旦你的尝试完成运行（目前约10分钟）。你知道吗

网友
2楼 · 编辑于 2024-04-26 20:24:16

如果你有矩阵6000x6000比你需要的（6000*6000-6000）/2=17997000计算。你知道吗
而不是使用np.triu指数，您可以尝试使用一个生成器来生成矩阵的上三角-这样可以减少内存消耗。试试这个，也许会有帮助。。你知道吗
def indxs(lst): for i1, el1 in enumerate(lst): for el2 in lst[i1:][1:]: yield (el1, el2)

相关问题更多 >

编程相关推荐

热门问题

热门文章