Find the indexes of the first occurrence of some value in a numpy array that is not X or Y

2 answers

User

Answer 1 · edited 2024-05-23 13:47:34

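import numpy as np
vec = np.array([2, 2, 2, 51, 51, 52, 52, 14, 14, 14, 51, 51, 52, 52])

# Collect the first index of every distinct value except 51 and 52
first_occurrence = []
for x in np.unique(vec):
    if x not in [51, 52]:
        # argmax on a boolean array returns the index of the first True
        first_occurrence.append(np.argmax(x == vec))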

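np.argmax finds the index of the first occurrence of the maximum value (i.e. True) in the boolean array x == vec. Because x comes from vec, at least one True value is guaranteed.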
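To make the guarantee above concrete, here is a minimal sketch of what np.argmax does on a boolean mask, including the case the loop avoids. The value 99 is a hypothetical absent value chosen only for illustration.

```python
import numpy as np

vec = np.array([2, 2, 2, 51, 51, 52, 52, 14, 14, 14, 51, 51, 52, 52])

# The mask is True wherever vec equals 14; argmax returns the first True.
first_14 = int(np.argmax(vec == 14))

# For a value that does not occur, the mask is all False and argmax
# returns 0, which is indistinguishable from a real match at index 0.
absent = int(np.argmax(vec == 99))

print(first_14, absent)
```

This is why the loop iterates over np.unique(vec) rather than over an arbitrary list of candidates: every x it tests is known to occur in vec.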

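Performance depends on the size of vec and on the number of values to look for. For larger arrays, this simple loop approach (blue in the original benchmark plot) outperforms the accepted answer (green and orange), especially when only a small number of values has to be found, as in the example (for the given toy example it is actually 1.7 times faster) (source). It turns out that using np.unique with return_index=True is relatively slow; for larger arrays, the memory allocation for the mask is another factor.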
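The benchmark itself is not reproduced here, but a rough way to run your own comparison might look like the sketch below. The function names, the random test array, and the repetition count are all assumptions made for illustration; timings vary by machine and data, so no result is claimed.

```python
import timeit
import numpy as np

def first_occurrence_loop(vec, exclude):
    # Loop approach: one boolean comparison over vec per kept value.
    return [int(np.argmax(vec == x)) for x in np.unique(vec) if x not in exclude]

def first_occurrence_unique(vec, exclude):
    # np.unique approach: first indices come from return_index=True.
    u, i = np.unique(vec, return_index=True)
    return i[np.isin(u, exclude, invert=True)]

vec = np.array([2, 2, 2, 51, 51, 52, 52, 14, 14, 14, 51, 51, 52, 52])
exclude = [51, 52]

# Hypothetical larger input for timing; adjust size and value range as needed.
rng = np.random.default_rng(0)
big = rng.integers(0, 100, size=10_000)

t_loop = timeit.timeit(lambda: first_occurrence_loop(big, exclude), number=20)
t_unique = timeit.timeit(lambda: first_occurrence_unique(big, exclude), number=20)
print(f"loop: {t_loop:.4f}s  unique: {t_unique:.4f}s")
```

Both functions return the same indices for the same input, so only the timings differ.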

User

Answer 2 · edited 2024-05-23 13:47:34

np.unique returns the first index of each number if you specify return_index=True. You can filter the result quite easily using np.isin, for example:

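# u: sorted unique values, i: index of the first occurrence of each one
u, i = np.unique(vec, return_index=True)
# keep only the indices whose value is not 51 or 52
result = i[np.isin(u, [51, 52], invert=True)]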

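The advantage of doing it this way is that u is a much smaller search space than the original data. Using invert=True is also slightly faster than explicitly negating the resulting mask.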
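As an illustration of the intermediate values, the sketch below runs the same steps on the example array and checks that invert=True gives the same mask as negating np.isin explicitly. The variable names keep and negated are introduced here only for the example.

```python
import numpy as np

vec = np.array([2, 2, 2, 51, 51, 52, 52, 14, 14, 14, 51, 51, 52, 52])

# u holds the sorted unique values, i the first index of each in vec.
u, i = np.unique(vec, return_index=True)

# Two equivalent ways to build the "not 51 or 52" mask over u.
keep = np.isin(u, [51, 52], invert=True)
negated = ~np.isin(u, [51, 52])

result = i[keep]
print(u, i, result)
```

Note that the mask has only as many entries as there are distinct values, regardless of how long vec is.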

A version of np.isin that relies on the data already being sorted can be written with np.searchsorted, as follows:

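def isin_sorted(a, i, invert=False):
    # a must be sorted; i holds the values to test for membership
    i = np.asarray(i)
    ind = np.searchsorted(a, i)
    # clip to the last valid index: searchsorted returns a.size for
    # values larger than every element, which would be out of bounds
    ind = ind[a[ind.clip(max=a.size - 1)] == i]
    if invert:
        mask = np.ones(a.size, dtype=bool)
        mask[ind] = False
    else:
        mask = np.zeros(a.size, dtype=bool)
        mask[ind] = True
    return mask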

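You can use this version in place of np.isin after calling np.unique, which always returns a sorted array. For a sufficiently large vec and exclusion list, it will be more efficient: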

result = i[isin_sorted(u, [51, 52], invert=True)]
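A quick sanity check of the sorted variant against np.isin might look like the sketch below. The helper is restated in compact form so the block is self-contained, with the clip bound at a.size - 1 and non-empty a as assumptions of this sketch. The value 99 is a hypothetical exclusion entry larger than anything in the data, included to exercise the out-of-range case.

```python
import numpy as np

def isin_sorted(a, values, invert=False):
    # Assumes a is sorted and non-empty.
    values = np.asarray(values)
    ind = np.searchsorted(a, values)
    # searchsorted returns a.size for values above a[-1]; clip keeps
    # the lookup in bounds, and the equality test then rejects them.
    ind = ind[a[ind.clip(max=a.size - 1)] == values]
    mask = np.full(a.size, invert, dtype=bool)
    mask[ind] = not invert
    return mask

vec = np.array([2, 2, 2, 51, 51, 52, 52, 14, 14, 14, 51, 51, 52, 52])
u, first_idx = np.unique(vec, return_index=True)

exclude = [51, 52, 99]
mask_sorted = isin_sorted(u, exclude, invert=True)
mask_numpy = np.isin(u, exclude, invert=True)
result = first_idx[mask_sorted]
print(mask_sorted, result)
```

If the two masks ever disagree on your data, the most likely cause is that the first argument was not actually sorted.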
