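How do I select array elements based on a condition?
Suppose I have a numpy array x = [5, 2, 3, 1, 4, 5] and another array y = ['f', 'o', 'o', 'b', 'a', 'r']. I want to select the elements of y that correspond to the elements of x that are greater than 1 and less than 5.
I tried this: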
x = array([5, 2, 3, 1, 4, 5])
y = array(['f','o','o','b','a','r'])
output = y[x > 1 & x < 5] # desired output is ['o','o','a']
But this doesn't work. How should I do it?
6 Answers
25
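To add to the answers by @J.F. Sebastian and @Mark Mikofski:
If you want the corresponding indices (rather than the actual array values), you can use the following code.
When all of several conditions must hold: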
select_indices = np.where( np.logical_and( x > 1, x < 5) )[0] # 1 < x <5
When any of several conditions must hold:
select_indices = np.where( np.logical_or( x < 1, x > 5 ) )[0] # x <1 or x >5
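As a minimal sketch of how these index arrays might be used with the x and y from the question (the names all_idx and any_idx are made up for illustration):

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# Indices where both conditions hold: 1 < x < 5
all_idx = np.where(np.logical_and(x > 1, x < 5))[0]
print(all_idx.tolist())     # [1, 2, 4]
print(y[all_idx].tolist())  # ['o', 'o', 'a']

# Indices where either condition holds: x < 1 or x > 5
any_idx = np.where(np.logical_or(x < 1, x > 5))[0]
print(any_idx.tolist())     # [] (no element of this x is below 1 or above 5)
```

Indexing y with the integer index array gives the same elements as indexing it with the boolean mask directly.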
44
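I think the asker doesn't actually want np.bitwise_and() (that is, &) but rather np.logical_and(), because they are comparing logical values such as True and False. To understand the difference between the two, see this post on logical versus bitwise operations.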
>>> import numpy as np
>>> from numpy import array
>>> x = array([5, 2, 3, 1, 4, 5])
>>> y = array(['f','o','o','b','a','r'])
>>> output = y[np.logical_and(x > 1, x < 5)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
dtype='|S1')
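To illustrate the logical versus bitwise distinction, here is a small sketch with made-up integer arrays a and b. On integers the two operations can disagree, while on boolean masks like the ones in the question they give the same result:

```python
import numpy as np

a = np.array([1, 2, 4])
b = np.array([2, 1, 4])

# Bitwise AND combines the integer bit patterns: 1 & 2 == 0, 2 & 1 == 0, 4 & 4 == 4
bitwise = a & b
# Logical AND treats every nonzero value as True
logical = np.logical_and(a, b)
print(bitwise.tolist())  # [0, 0, 4]
print(logical.tolist())  # [True, True, True]

# On boolean masks the two agree
x = np.array([5, 2, 3, 1, 4, 5])
m1 = (x > 1) & (x < 5)
m2 = np.logical_and(x > 1, x < 5)
print(np.array_equal(m1, m2))  # True
```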
An equivalent approach is to use np.all() with the axis argument set appropriately.
>>> output = y[np.all([x > 1, x < 5], axis=0)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
dtype='|S1')
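A sketch of why axis=0 is the right choice: the list of masks becomes a 2-D array with one row per condition, and reducing along axis 0 combines the conditions element by element. The third condition below (x is even) is an invented extra, added only to show that the pattern extends to any number of conditions:

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# One boolean row per condition; the last one is a hypothetical extra.
conditions = [x > 1, x < 5, x % 2 == 0]
stacked = np.array(conditions)
print(stacked.shape)  # (3, 6)

# axis=0 collapses the condition rows, leaving one boolean per element of x
mask = np.all(stacked, axis=0)
print(y[mask].tolist())  # ['o', 'a']
```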
Specifically, in terms of timing:
>>> %timeit (a < b) & (b < c)
The slowest run took 32.97 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.15 µs per loop
>>> %timeit np.logical_and(a < b, b < c)
The slowest run took 32.59 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.17 µs per loop
>>> %timeit np.all([a < b, b < c], 0)
The slowest run took 67.47 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.06 µs per loop
Using np.all() is slower (about 5 µs versus about 1.2 µs per loop in this run), while & and logical_and run at about the same speed.
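The %timeit lines above need IPython, and the arrays a, b and c are not shown. A rough way to repeat the comparison in plain Python is sketched below, assuming three small random arrays as stand-ins. Absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

# Assumed inputs: the original a, b, c are not given, so use small random arrays.
rng = np.random.default_rng(0)
a, b, c = rng.random(6), rng.random(6), rng.random(6)

candidates = {
    "&": lambda: (a < b) & (b < c),
    "logical_and": lambda: np.logical_and(a < b, b < c),
    "all": lambda: np.all([a < b, b < c], 0),
}

loops = 10_000
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=loops)
    print(f"{name}: {seconds / loops * 1e6:.2f} microseconds per loop")
```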
270
Your expression works if you add parentheses:
>>> y[(1 < x) & (x < 5)]
array(['o', 'o', 'a'],
dtype='|S1')
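The parentheses matter because of Python's operator precedence: & binds more tightly than the comparison operators. The sketch below shows how the unparenthesized expression is parsed and why it raises an error (the failed flag is there only for illustration):

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# `x > 1 & x < 5` is parsed as the chained comparison `x > (1 & x) < 5`,
# i.e. `(x > (1 & x)) and ((1 & x) < 5)`. The implicit `and` asks for the
# truth value of a whole array, which NumPy refuses to guess.
try:
    y[x > 1 & x < 5]
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)

# With parentheses, each comparison is evaluated first, then combined.
result = y[(x > 1) & (x < 5)]
print(result.tolist())  # ['o', 'o', 'a']
```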