How to speed up a Python loop

User

Reply #1 · edited 2024-05-15 20:27:38

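As far as I can see, this cannot easily be vectorized further unless id has some structure. Otherwise the bottleneck is probably that id == dummy is evaluated so often, and the only solution I can think of is sorting. Because np.max() has no reduce functionality, that still needs quite a bit of Python code (edit: a reduce function is in fact available through np.fmax). For x of shape 1000x1000 and id/ids in 0..100 this is about 3 times faster, but since it is rather complex it is only worthwhile for larger problems with many ids: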

import numpy as np

def max_at_ids(x, id, ids):
    # create a 1D view of x and id:
    r_x = x.ravel()
    r_id = id.ravel()
    sorter = np.argsort(r_id)

    # create new arrays, both ordered by id:
    r_id = r_id[sorter]
    r_x = r_x[sorter]

    # unfortunately there is no reduce functionality for np.max...

    ids = np.unique(ids)  # create a sorted, unique copy, just in case

    # w gives the positions where the id changes in the sorted array:
    w = np.where(r_id[:-1] != r_id[1:])[0] + 1

I originally offered a solution that runs a pure Python loop over the slices, but below is a shorter (and faster) version:

^{pr2}$

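Edit: an improved version of the per-slice maximum calculation:

The Python loop can be removed by using np.fmax.reduceat, which may outperform the previous version by a lot if the slices are small (and is actually quite elegant):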

    # (continues the body of max_at_ids)
    # just add 0 at the start of w
    # (or calculate the first slice by hand and use the out=... keyword argument
    # to avoid even this copy).
    w = np.concatenate(([0], w))
    max_x = np.fmax.reduceat(r_x, w)
    return ids, max_x
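
To make the reduceat step concrete, here is a tiny hand-checkable sketch. The arrays are made up for illustration and are assumed to be already sorted by id, as they are at that point in the function above; `starts` is just an illustrative name for w with the leading 0 added.

```python
import numpy as np

# Made-up data, already sorted by id (three runs: id 0, id 1, id 2).
r_id = np.array([0, 0, 0, 1, 1, 2])
r_x = np.array([3.0, 8.0, 1.0, 4.0, 6.0, 5.0])

# Positions where the id changes, i.e. where each new run begins.
w = np.where(r_id[:-1] != r_id[1:])[0] + 1

# reduceat needs the start of every run, including the first one at 0.
starts = np.concatenate(([0], w))

# One maximum per run: r_x[0:3], r_x[3:5], r_x[5:].
max_x = np.fmax.reduceat(r_x, starts)
print(w, max_x)
```

Without the leading 0, the first run would simply be skipped, because reduceat only reduces from each given index up to the next one.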

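There are probably a few small things that could make this a little faster. If id/ids has some structure, it should be possible to simplify the code and perhaps use a different approach for a much larger speedup. Otherwise the speedup from this code should be large as long as there are many (unique) ids and the x/id arrays are not very small. Note that the code enforces np.unique(ids), which is probably a reasonable assumption.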
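
For reference, the sorting and reduceat pieces can be put together into one self-contained function. This is only a sketch under a few assumptions: the names `max_per_id` and `loop_max` are hypothetical, the inputs are non-empty and of equal shape, and the loop baseline is a guess at the kind of `id == dummy` loop discussed above, not the original poster's code. Unlike the fragments above, it returns the ids actually present in the id array rather than np.unique(ids).

```python
import numpy as np

def max_per_id(x, id_arr):
    """Return (unique ids, maximum of x for each id) via sort + reduceat."""
    r_x = np.ravel(x)
    r_id = np.ravel(id_arr)

    # Sort both arrays by id so that equal ids form contiguous runs.
    sorter = np.argsort(r_id)
    r_id = r_id[sorter]
    r_x = r_x[sorter]

    # Start index of every run: 0 plus each position where the id changes.
    starts = np.concatenate(([0], np.flatnonzero(r_id[:-1] != r_id[1:]) + 1))

    # One fmax reduction per run; note that np.fmax ignores NaN where it can.
    return r_id[starts], np.fmax.reduceat(r_x, starts)

def loop_max(x, id_arr, ids):
    """Assumed baseline: one boolean mask and one max per requested id."""
    return np.array([x[id_arr == i].max() for i in ids])

rng = np.random.default_rng(0)
x = rng.random((200, 200))
id_arr = rng.integers(0, 100, size=x.shape)

found_ids, max_x = max_per_id(x, id_arr)
same = np.array_equal(max_x, loop_max(x, id_arr, found_ids))
print(same)
```

To get the maxima for only some ids, one option is to look them up in the sorted `found_ids` with np.searchsorted and check that each requested id was actually found.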

User

Reply #2 · edited 2024-05-15 20:27:38

{{1} should give some speedup compared with using cd2}.

User

Reply #3 · edited 2024-05-15 20:27:38

scipy.ndimage.maximum does exactly this:

import numpy as np
from scipy import ndimage as nd

N = 100  # number of values
K = 10   # number of classes

# generate random data
x   = np.random.rand(N)
ID  = np.random.randint(0, K, N)  # random class id for each x[i]
ids = np.random.randint(0, K, 5)  # pick 5 random classes (may repeat)

# maximum of x within each requested class
max_per_id = nd.maximum(x, labels=ID, index=ids)

print(dict(zip(ids, max_per_id)))

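If you want to compute the maximum for all ids, use ids = ID.

Note that if a particular class in ids is not found in ID (i.e. no x is labeled with that class), the maximum returned for that class is 0.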
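
Because a returned 0 could also be a genuine maximum, it can help to filter out the classes that never occur. A minimal sketch with made-up data, assuming SciPy is installed; the `present` mask and the `result` dict are illustrative additions, not part of the answer above:

```python
import numpy as np
from scipy import ndimage

x = np.array([1.0, 5.0, 2.0, 7.0])
ID = np.array([0, 0, 1, 1])
ids = np.array([0, 1, 3])  # class 3 never occurs in ID

max_per_id = ndimage.maximum(x, labels=ID, index=ids)

# Keep only the classes that actually label at least one value.
present = np.isin(ids, ID)
result = {int(i): float(m) for i, m in zip(ids[present], max_per_id[present])}
print(max_per_id, result)
```

Passing np.unique(ID) as the index avoids the problem entirely when the maximum of every class is wanted.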
