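Filter a numpy array by maximum value

3 answers

User

#1 · edited 2024-05-15 10:11:27

I see you've already mentioned pandas in the comments. FWIW, here's how to get the desired behavior with it, assuming you don't care about the final row order, since groupby sorts the result by the group keys.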

In [14]: arr
Out[14]:
array([[ 0.7732126 ,  0.48649481,  0.29771819,  0.91622924],
       [ 0.7732126 ,  0.48649481,  0.29771819,  1.91622924],
       [ 0.58294263,  0.32025559,  0.6925856 ,  0.0524125 ],
       [ 0.58294263,  0.32025559,  0.6925856 ,  0.05      ],
       [ 0.58294263,  0.32025559,  0.6925856 ,  1.7       ],
       [ 0.3239913 ,  0.7786444 ,  0.41692853,  0.10467392],
       [ 0.12080023,  0.74853649,  0.15356663,  0.4505753 ],
       [ 0.13536096,  0.60319054,  0.82018125,  0.10445047],
       [ 0.1877724 ,  0.96060999,  0.39697999,  0.59078612]])

In [15]: import pandas as pd

In [16]: pd.DataFrame(arr)
Out[16]:
          0         1         2         3
0  0.773213  0.486495  0.297718  0.916229
1  0.773213  0.486495  0.297718  1.916229
2  0.582943  0.320256  0.692586  0.052413
3  0.582943  0.320256  0.692586  0.050000
4  0.582943  0.320256  0.692586  1.700000
5  0.323991  0.778644  0.416929  0.104674
6  0.120800  0.748536  0.153567  0.450575
7  0.135361  0.603191  0.820181  0.104450
8  0.187772  0.960610  0.396980  0.590786

In [17]: pd.DataFrame(arr).groupby([0,1,2]).max().reset_index()
Out[17]:
          0         1         2         3
0  0.120800  0.748536  0.153567  0.450575
1  0.135361  0.603191  0.820181  0.104450
2  0.187772  0.960610  0.396980  0.590786
3  0.323991  0.778644  0.416929  0.104674
4  0.582943  0.320256  0.692586  1.700000
5  0.773213  0.486495  0.297718  1.916229
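
If the original row order does matter, one option is to pass `sort=False` to `groupby`, which keeps the groups in order of first appearance. This is a minimal sketch rather than part of the answer above: the small array `arr` here is made up for illustration, and the column labels 0 to 3 are just the DataFrame defaults.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the first three columns are the key, the last is the value.
arr = np.array([
    [0.7, 0.4, 0.2, 0.9],
    [0.7, 0.4, 0.2, 1.9],
    [0.5, 0.3, 0.6, 0.05],
    [0.5, 0.3, 0.6, 1.7],
    [0.1, 0.7, 0.1, 0.4],
])

# sort=False keeps each group at the position where its key first appears.
result = (
    pd.DataFrame(arr)
    .groupby([0, 1, 2], sort=False)
    .max()
    .reset_index()
    .to_numpy()
)
print(result)
```

With this input, the three groups come back in the order (0.7, 0.4, 0.2), (0.5, 0.3, 0.6), (0.1, 0.7, 0.1), each with its largest last-column value.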

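User

#2 · edited 2024-05-15 10:11:27

You can start by lex-sorting the input array so that entries with identical first three elements end up next to each other. Then create another 2D array to hold the last-column entries, such that the elements belonging to each duplicated triplet go into the same row. Next, take the max along each row of this 2D array to get the final max output for each unique triplet. Here's the implementation, assuming A is the input array -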

import numpy as np

# Lex sort A
sortedA = A[np.lexsort(A[:,:-1].T)]

# Mask of start of unique first three columns from A
start_unqA = np.append(True,~np.all(np.diff(sortedA[:,:-1],axis=0)==0,axis=1))

# Counts of unique first three columns from A
counts = np.bincount(start_unqA.cumsum()-1)
mask = np.arange(counts.max()) < counts[:,None]

# Group A's last column into rows based on uniqueness from first three columns
grpA = np.empty(mask.shape)
grpA.fill(np.nan)
grpA[mask] = sortedA[:,-1]

# Concatenate unique first three columns from A and 
# corresponding max values for each such unique triplet
out = np.column_stack((sortedA[start_unqA,:-1],np.nanmax(grpA,axis=1)))
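
The steps above can be wrapped in a function and tried on a small input. This is only a sketch: the function name `group_max_last_column` and the sample array are made up for illustration, and the start-of-group mask uses `np.any(... != 0)`, which is equivalent to the `~np.all(... == 0)` form above.

```python
import numpy as np

def group_max_last_column(A):
    """Max of the last column for each unique combination of the other columns."""
    # Sort rows so that identical key columns become adjacent.
    sortedA = A[np.lexsort(A[:, :-1].T)]
    # True on the first row of every run of identical keys.
    start = np.append(True, np.any(np.diff(sortedA[:, :-1], axis=0) != 0, axis=1))
    # Number of rows in each run, and a mask for the padded 2D layout.
    counts = np.bincount(start.cumsum() - 1)
    mask = np.arange(counts.max()) < counts[:, None]
    # One row per key, padded with NaN where a run is shorter than the longest.
    grp = np.full(mask.shape, np.nan)
    grp[mask] = sortedA[:, -1]
    return np.column_stack((sortedA[start, :-1], np.nanmax(grp, axis=1)))

# Hypothetical input: two keys appear twice, one appears once.
A = np.array([
    [1.0, 2.0, 3.0, 0.5],
    [4.0, 5.0, 6.0, 0.1],
    [1.0, 2.0, 3.0, 0.9],
    [4.0, 5.0, 6.0, 0.05],
    [7.0, 8.0, 9.0, 0.3],
])
out = group_max_last_column(A)
print(out)
```

The padded layout costs memory proportional to the number of unique keys times the size of the largest group, so it suits inputs where group sizes are fairly even.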


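User

#3 · edited 2024-05-15 10:11:27

It's convoluted, but it is probably the best you are going to get using numpy only...

First, we use lexsort to put all entries with the same coordinates together. With a being your sample array: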

>>> perm = np.lexsort(a[:, 3::-1].T)
>>> a[perm]
array([[ 0.12080023,  0.74853649,  0.15356663,  0.4505753 ],
       [ 0.13536096,  0.60319054,  0.82018125,  0.10445047],
       [ 0.1877724 ,  0.96060999,  0.39697999,  0.59078612],
       [ 0.3239913 ,  0.7786444 ,  0.41692853,  0.10467392],
       [ 0.58294263,  0.32025559,  0.6925856 ,  0.05      ],
       [ 0.58294263,  0.32025559,  0.6925856 ,  0.0524125 ],
       [ 0.58294263,  0.32025559,  0.6925856 ,  1.7       ],
       [ 0.7732126 ,  0.48649481,  0.29771819,  0.91622924],
       [ 0.7732126 ,  0.48649481,  0.29771819,  1.91622924]])

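Note that by reversing the column order, we are sorting by x, breaking ties with y, then z, then the fourth (value) column. This works because lexsort treats the last key it is given as the primary one.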
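
The key order of `np.lexsort` is easy to get backwards, so here is a small illustration. The arrays `x` and `y` are made up for this sketch and are not the coordinates from the question.

```python
import numpy as np

# Hypothetical data: x is meant to be the primary key, y breaks ties.
x = np.array([2, 1, 2, 1])
y = np.array([0, 5, -1, 3])

# np.lexsort treats the LAST key as the primary one, so pass (y, x).
order = np.lexsort((y, x))
print(order)
print(x[order], y[order])

# Same thing with a 2D array: reverse the columns, then transpose,
# so that column 0 ends up as the last (primary) key.
a = np.column_stack((x, y))
order_2d = np.lexsort(a[:, ::-1].T)
print(order_2d)
```

Both calls return the same permutation, which sorts by `x` first and by `y` within equal `x`.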

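Since the maximum is what we are after, we only need to take the last entry in each group, which is fairly straightforward to do. The snippets that follow refer to the sorted array as a_sorted, to the selector of each group's last row as last, and to the resulting rows as a_unique_max.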
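
A minimal sketch of this step follows. The names `a_sorted`, `last` and `a_unique_max` are the ones used by the later snippets in this answer, but how they were originally built is not shown here, so treat this as one possible construction. It uses `np.any`, so that a change in any single coordinate closes a group, and the sample array `a` is made up for illustration.

```python
import numpy as np

# Hypothetical sample: columns 0..2 are coordinates, column 3 is the value.
a = np.array([
    [0.7, 0.4, 0.2, 0.9],
    [0.7, 0.4, 0.2, 1.9],
    [0.5, 0.3, 0.6, 0.05],
    [0.5, 0.3, 0.6, 1.7],
    [0.1, 0.7, 0.1, 0.4],
])

# Sort by column 0, then 1, then 2, then 3 (lexsort's last key is primary).
perm = np.lexsort(a[:, ::-1].T)
a_sorted = a[perm]

# A row is the last of its group if the next row has different coordinates;
# the final row always closes a group.
differs = np.any(a_sorted[:-1, :3] != a_sorted[1:, :3], axis=1)
last = np.concatenate((differs, [True]))

# Within a group the values are ascending, so the last row holds the max.
a_unique_max = a_sorted[last]
print(a_unique_max)

# The same rows, ordered by where each maximum sits in the original array.
restored = a_unique_max[np.argsort(perm[last])]
print(restored)
```

Because the value column is the least significant sort key, the largest value of each group is guaranteed to be its last row, which is why no explicit `max` call is needed.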

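If you can't have them sorted, and instead need them in the order in which they appear in the original array, you can recover that order with the help of perm: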

>>> a_unique_max[np.argsort(perm[last])]
array([[ 0.7732126 ,  0.48649481,  0.29771819,  1.91622924],
       [ 0.58294263,  0.32025559,  0.6925856 ,  1.7       ],
       [ 0.3239913 ,  0.7786444 ,  0.41692853,  0.10467392],
       [ 0.12080023,  0.74853649,  0.15356663,  0.4505753 ],
       [ 0.13536096,  0.60319054,  0.82018125,  0.10445047],
       [ 0.1877724 ,  0.96060999,  0.39697999,  0.59078612]])

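This only works for the maximum, and it comes as a by-product of the sorting. If you are after a different function, say the product of all entries with the same coordinates, you could do something like: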

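>>> a_sorted = a[perm]
>>> # np.any, not np.all: a change in any one coordinate starts a new group
>>> first = np.concatenate(([True],
...                         np.any(a_sorted[:-1, :3] != a_sorted[1:, :3], axis=1)))
>>> a_unique_prods = np.multiply.reduceat(a_sorted, np.nonzero(first)[0])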

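You will have to do a little fiddling with these results to assemble your return array, because reduceat is applied to every column here, so the coordinate columns get multiplied together as well.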
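
One way to do that fiddling, as a sketch: reduce only the value column and then reattach the coordinates of each group's first row. The sample array `a` and the intermediate names `changed`, `starts` and `prods` are made up for illustration.

```python
import numpy as np

# Hypothetical sample: columns 0..2 are coordinates, column 3 is the value.
a = np.array([
    [0.7, 0.4, 0.2, 2.0],
    [0.7, 0.4, 0.2, 3.0],
    [0.5, 0.3, 0.6, 0.5],
    [0.5, 0.3, 0.6, 4.0],
    [0.1, 0.7, 0.1, 0.4],
])

a_sorted = a[np.lexsort(a[:, ::-1].T)]

# True on the first row of every group of identical coordinates.
changed = np.any(a_sorted[:-1, :3] != a_sorted[1:, :3], axis=1)
first = np.concatenate(([True], changed))
starts = np.nonzero(first)[0]

# Reduce only the value column, one segment per group.
prods = np.multiply.reduceat(a_sorted[:, 3], starts)

# Reattach each group's coordinates to its product.
a_unique_prods = np.column_stack((a_sorted[first, :3], prods))
print(a_unique_prods)
```

Swapping `np.multiply` for another ufunc such as `np.add` or `np.minimum` gives the per-group sum or minimum with the same structure.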
