Most efficient way to get the row indices of True values, column by column

import numpy as np

arr = np.random.randint(2, size=15).reshape((3,5)).astype(bool)
print(arr)
# Example output:
# [[ True False  True False  True]
#  [False  True False  True  True]
#  [ True  True False False  True]]

def calc(matrix):
    result = []
    # Collect the row indices of the True values in each column
    for i in range(matrix.shape[1]):
        result.append(np.argwhere(matrix[:, i]).flatten().tolist())
    return result

print(calc(arr))
# [[0, 2], [1, 2], [0], [1], [0, 1, 2]]

2 answers

User

Answer 1 · edited 2024-04-20 05:44:10

My solution is:

column, row = np.where(arr.T)
_, indices = np.unique(column, return_index=True)
np.split(row, indices[1:])

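This is slightly slower than what @Divakar proposed. However, I find it more readable, because it avoids the complicated np.flatnonzero(r[1:] != r[:-1])+1 part, so it is immediately clear what is happening.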
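To make the intermediate values concrete, here is a minimal sketch of this solution applied to the array printed in the question. The hard-coded `arr` and the name `groups` are illustrative assumptions; the sketch uses only `np.where`, `np.unique`, and `np.split`.

```python
import numpy as np

# The 3x5 boolean array printed in the question, hard-coded for reproducibility.
arr = np.array([[True, False, True, False, True],
                [False, True, False, True, True],
                [True, True, False, False, True]])

# np.where on the transpose returns indices ordered by column first.
column, row = np.where(arr.T)
# column -> [0 0 1 1 2 3 4 4 4], row -> [0 2 1 2 0 1 0 1 2]

# return_index gives the position at which each column value first appears.
_, indices = np.unique(column, return_index=True)
# indices -> [0 2 4 5 6]

# Split the row indices at every boundary except the leading 0.
groups = [g.tolist() for g in np.split(row, indices[1:])]
print(groups)  # [[0, 2], [1, 2], [0], [1], [0, 1, 2]]
```

Note that a column without any True value never appears in `column`, so in that case this returns fewer groups than there are columns.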

User

Answer 2 · edited 2024-04-20 05:44:10

Approach 1

Here is a vectorized NumPy approach that groups these row indices into a list of arrays:

r,c = np.where(arr.T)
out = np.split(c, np.flatnonzero(r[1:] != r[:-1])+1)
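The boundary expression is easier to follow when broken into steps. The following is a small sketch on a made-up 2x3 array (the array and the names `change` and `cut_points` are assumptions for illustration, not part of the original answer).

```python
import numpy as np

# Small illustrative array (an assumption, not from the original post).
arr = np.array([[True, False, True],
                [True, True, False]])

r, c = np.where(arr.T)   # r: column of each True value, c: its row
# r -> [0 0 1 2], c -> [0 1 1 0]

# Compare each entry of r with its predecessor; True marks the start of a new column.
change = r[1:] != r[:-1]                  # [False  True  True]
cut_points = np.flatnonzero(change) + 1   # [2 3]

out = np.split(c, cut_points)
print([a.tolist() for a in out])  # [[0, 1], [1], [0]]
```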

Approach 2

Alternatively, we can use a list comprehension to avoid the split:

idx = np.concatenate(([0], np.flatnonzero(r[1:] != r[:-1])+1, [r.size] ))
out = [c[idx[i]:idx[i+1]] for i in range(len(idx)-1)]

We are reusing r and c from Approach 1.
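As a quick sanity check, the slicing version should produce the same groups as the `np.split` version. Here is a hedged sketch that compares the two on a seeded random array; the seed, the array shape, and the names `cuts`, `out_slices`, and `out_split` are arbitrary choices made for this example.

```python
import numpy as np

# Seeded random test array (size and seed are arbitrary choices).
rng = np.random.default_rng(0)
arr = rng.integers(0, 2, size=(4, 6)).astype(bool)

r, c = np.where(arr.T)
cuts = np.flatnonzero(r[1:] != r[:-1]) + 1

# Slice boundaries: start at 0, cut at each column change, end at r.size.
idx = np.concatenate(([0], cuts, [r.size]))
out_slices = [c[start:stop] for start, stop in zip(idx[:-1], idx[1:])]

# Reference result from the np.split version.
out_split = np.split(c, cuts)

same = len(out_slices) == len(out_split) and all(
    np.array_equal(a, b) for a, b in zip(out_slices, out_split)
)
print(same)  # True
```

Pairing `idx[:-1]` with `idx[1:]` via `zip` is equivalent to indexing `idx[i]` and `idx[i+1]` in a `range` loop.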

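Approach 3 (outputs empty lists/arrays for all-zero columns)

To account for all-zero columns, for which we want empty lists/arrays in the output, here is a modified approach: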

idx = np.concatenate(([0], arr.sum(0).cumsum() ))
out = [c[idx[i]:idx[i+1]] for i in range(len(idx)-1)]

We are reusing c from Approach 1.

Sample run:

In [177]: arr
Out[177]: 
array([[ True, False, False, False, False],
       [ True, False, False, False,  True],
       [ True, False,  True, False,  True]], dtype=bool)

In [178]: idx = np.concatenate(([0], arr.sum(0).cumsum() ))
     ...: out = [c[idx[i]:idx[i+1]] for i in range(len(idx)-1)]
     ...: 

In [179]: out
Out[179]: 
[array([0, 1, 2]),
 array([], dtype=int64),
 array([2]),
 array([], dtype=int64),
 array([1, 2])]
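The reason the cumulative sum works is that c is already ordered by column, so the running total of per-column True counts gives the end offset of every column's block, and a column with a count of zero yields an empty slice. Below is a sketch that reproduces the sample run step by step and contrasts it with the split on changes in r; the names `counts` and `split_out` are introduced here for illustration.

```python
import numpy as np

# Same array as in the sample run above.
arr = np.array([[True, False, False, False, False],
                [True, False, False, False, True],
                [True, False, True, False, True]])

r, c = np.where(arr.T)            # c -> [0 1 2 2 1 2]

counts = arr.sum(0)               # True values per column: [3 0 1 0 2]
idx = np.concatenate(([0], counts.cumsum()))   # [0 3 3 4 4 6]

out = [c[idx[i]:idx[i+1]] for i in range(len(idx) - 1)]
print([a.tolist() for a in out])  # [[0, 1, 2], [], [2], [], [1, 2]]

# For comparison, splitting on changes in r drops the empty columns.
split_out = np.split(c, np.flatnonzero(r[1:] != r[:-1]) + 1)
print(len(split_out), len(out))   # 3 5
```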

Approach 4

Here is another way to handle all-zero columns:

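unq, IDs = np.unique(r, return_index=True)  # first position of each column in r
idx = np.concatenate(( IDs, [r.size] ))
# One separate empty list per column (avoids aliasing a single shared list)
out = [[] for _ in range(arr.shape[1])]
for i,item in enumerate(unq):
    out[item] = c[idx[i]:idx[i+1]]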

We are reusing r and c from Approach 1.
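For reuse, the same idea can be wrapped in a function. This is only a sketch: `rows_per_column` is a hypothetical helper name, the test array is made up, and the placeholders are empty arrays (`c[:0]`) rather than empty lists so that every entry has the same type.

```python
import numpy as np

def rows_per_column(arr):
    """Hypothetical helper: row indices of the True values in each column."""
    r, c = np.where(arr.T)
    unq, ids = np.unique(r, return_index=True)
    idx = np.concatenate((ids, [r.size]))
    # Start with an empty array for every column ...
    out = [c[:0] for _ in range(arr.shape[1])]
    # ... then fill in the columns that do contain True values.
    for i, col in enumerate(unq):
        out[col] = c[idx[i]:idx[i + 1]]
    return out

# Illustrative array whose first column is all False.
arr = np.array([[False, True, False],
                [False, True, True]])
result = [a.tolist() for a in rows_per_column(arr)]
print(result)  # [[], [0, 1], [1]]
```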
