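NumPy: finding the sorted indices above and below a threshold in a masked 2D array
I have a masked two-dimensional numeric array and want to sort its values from lowest to highest. For example: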
import numpy as np
# Make a random masked array
>>> ar = np.ma.array(np.round(np.random.normal(50, 10, 20), 1),
mask=np.random.binomial(1, .2, 20)).reshape((4,5))
>>> print(ar)
[[-- 51.9 38.3 46.8 43.3]
[52.3 65.0 51.2 46.5 --]
[56.7 51.1 -- 38.6 33.5]
[45.2 56.8 74.1 58.4 56.4]]
# Sort the array from lowest to highest, with a flattened index
>>> sorted_ind = ar.argsort(axis=None)
>>> print(sorted_ind)
[14 2 13 4 15 8 3 11 7 1 5 19 10 16 18 6 17 0 12 9]
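Once I have the sorted indices, I need to split them into two simple groups: those whose values are less than or equal to a given datum, and those whose values are greater than or equal to it. I also don't need the masked values, so they have to be removed. For example, with datum = 51.1, how can I filter sorted_ind into the 10 indices whose values are greater than or equal to datum and the 8 indices whose values are less than or equal to datum? (Note: because both conditions include equality, one index appears in both groups, and the 3 masked values must be dropped from the analysis.) I need to keep the flattened index positions, because I will later pass them to np.unravel_index(ind, ar.shape).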
2 Answers
3
Setup:
>>> ar = np.ma.array(np.round(np.random.normal(50, 10, 20), 1),
mask=np.random.binomial(1, .2, 20)).reshape((4,5))
>>> print(ar)
[[59.9 51.3 -- 19.7 --]
[59.1 57.2 48.6 49.8 46.3]
[51.1 61.6 36.9 52.2 51.7]
[37.9 -- -- 53.1 57.5]]
>>> sorted_ind = ar.argsort(axis=None)
>>> sorted_ind
array([ 3, 12, 15, 9, 7, 8, 10, 1, 14, 13, 18, 6, 19, 5, 0, 11, 4,
2, 16, 17])
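And now the new part:
>>> flat = ar.flatten()
>>> # In Python 3, filter() returns a lazy iterator, so build the lists explicitly;
>>> # a comparison against a masked entry is falsy, which drops the masked indices
>>> leq_ind = [int(i) for i in sorted_ind if flat[i] <= 51.1]
>>> leq_ind
[3, 12, 15, 9, 7, 8, 10]
>>> geq_ind = [int(i) for i in sorted_ind if flat[i] >= 51.1]
>>> geq_ind
[10, 1, 14, 13, 18, 6, 19, 5, 0, 11]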
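The element-by-element filtering above can also be written as a vectorized boolean selection. This is only a sketch, not part of the original answer: the small array and the name vals are made up for illustration, and it relies on filled(False) turning masked comparison results into plain False.

```python
import numpy as np

# Small hand-made masked array (hypothetical data, not the random draw above)
ar = np.ma.array(
    [[59.9, 51.3, 0.0], [19.7, 51.1, 46.3]],
    mask=[[0, 0, 1], [0, 0, 0]],
)
datum = 51.1

sorted_ind = ar.argsort(axis=None)  # flat indices; masked entries are placed last
vals = ar.flatten()[sorted_ind]     # values in sorted order, still masked

# Comparing a masked array gives a masked boolean array;
# filled(False) replaces the masked results with plain False.
leq_ind = sorted_ind[(vals <= datum).filled(False)]
geq_ind = sorted_ind[(vals >= datum).filled(False)]

print(leq_ind)
print(geq_ind)
```

Both results stay in sorted order and keep the flattened index positions, so they can be passed on unchanged.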
5
You can use np.ma.where() here:
import numpy as np
np.random.seed(0)
# Make a random masked array
ar = np.ma.array(np.round(np.random.normal(50, 10, 20), 1),
mask=np.random.binomial(1, .2, 20)).reshape((4,5))
# Sort the array from lowest to highest, with a flattened index
sorted_ind = ar.argsort(axis=None)
tmp = ar.flatten()[sorted_ind]
print(sorted_ind[np.ma.where(tmp <= 51.0)])  # indices of unmasked values <= 51.0
print(sorted_ind[np.ma.where(tmp >= 51.0)])  # indices of unmasked values >= 51.0
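To see why this drops the masked entries, here is a tiny illustration with made-up data (the names vals, cond, and hits are hypothetical). Called with a single argument, np.ma.where returns the positions where the condition is true and unmasked.

```python
import numpy as np

# Hypothetical one-dimensional data; the second entry is masked
vals = np.ma.array([10.0, 20.0, 30.0, 40.0], mask=[0, 1, 0, 0])

# The comparison result is masked wherever the input is masked
cond = vals <= 30.0

# With one argument, np.ma.where returns a tuple of index arrays
# and skips positions whose condition is masked
(hits,) = np.ma.where(cond)

print(hits)
```

Index 1 holds 20.0, which would satisfy the condition, but it is masked and so does not appear in hits.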
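But because tmp is sorted, you can use np.searchsorted():
tmp = ar.flatten()[sorted_ind].compressed()  # compressed() drops the masked values, which argsort placed last
idx = np.searchsorted(tmp, 51.0)  # default side='left': first position where tmp >= 51.0
print(sorted_ind[:idx])  # indices of values < 51.0
print(sorted_ind[idx:len(tmp)])  # indices of values >= 51.0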
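One detail worth noting: np.searchsorted defaults to side='left', so slicing at a single idx puts a value exactly equal to the threshold only in the upper group. If the threshold should be included on both sides, as the question asks, one option is to search twice. The sketch below uses made-up data and the hypothetical names valid, lo, and hi, and ends with the np.unravel_index step the question mentions.

```python
import numpy as np

# Small hand-made masked array (hypothetical data)
ar = np.ma.array(
    [[59.9, 51.3, 0.0], [19.7, 51.1, 46.3]],
    mask=[[0, 0, 1], [0, 0, 0]],
)
datum = 51.1

sorted_ind = ar.argsort(axis=None)
valid = ar.flatten()[sorted_ind].compressed()  # sorted values, masked ones dropped

lo = np.searchsorted(valid, datum, side="left")   # first position with value >= datum
hi = np.searchsorted(valid, datum, side="right")  # first position with value > datum

leq_ind = sorted_ind[:hi]            # values <= datum
geq_ind = sorted_ind[lo:len(valid)]  # values >= datum, masked tail excluded

# Convert the flat indices back to (row, column) positions
rows, cols = np.unravel_index(geq_ind, ar.shape)

print(leq_ind, geq_ind)
print(rows, cols)
```

The two binary searches cost O(log n) each after the sort, so this should scale better than testing every element when the array is large.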