NumPy arrays: efficiently using arrays that contain indices

1 answer

User

#1 · posted 2024-04-26 00:20:38

A simpler 2D case:

In [48]: index1=np.array([1,1,2,2,3,3,4,4]);
     index2=np.array([0,2,1,2,3,4,4,5])
In [49]: data=np.arange(1,9)
In [50]: target=np.zeros((5,6))
In [53]: target[index1,index2]=data

In [54]: target
Out[54]: 
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  2.,  0.,  0.,  0.],
       [ 0.,  3.,  4.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  5.,  6.,  0.],
       [ 0.,  0.,  0.,  0.,  7.,  8.]])

If you ravel the indices (convert the pair of index arrays into a single flat index), you can use np.put or target.flat:

In [51]: flatindex=np.ravel_multi_index((index1,index2),target.shape)
In [52]: flatindex
Out[52]: array([ 6,  8, 13, 14, 21, 22, 28, 29], dtype=int32)
In [58]: np.put(target,flatindex,data)
In [61]: target.flat[flatindex]=data
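
As a standalone check of the session above, here is a minimal sketch that runs the three assignment forms on separate arrays and confirms they agree. The names a, b, c and same are mine, added only for illustration; everything else comes from the session.

```python
import numpy as np

index1 = np.array([1, 1, 2, 2, 3, 3, 4, 4])
index2 = np.array([0, 2, 1, 2, 3, 4, 4, 5])
data = np.arange(1, 9)
shape = (5, 6)

# For a C-ordered (5, 6) array, element (i, j) sits at flat position i*6 + j.
flatindex = np.ravel_multi_index((index1, index2), shape)

a = np.zeros(shape)
a[index1, index2] = data      # 2D advanced-index assignment

b = np.zeros(shape)
np.put(b, flatindex, data)    # in-place assignment by flat index

c = np.zeros(shape)
c.flat[flatindex] = data      # same positions, via the flat iterator

same = np.array_equal(a, b) and np.array_equal(a, c)
print(flatindex.tolist())     # [6, 8, 13, 14, 21, 22, 28, 29]
print(same)                   # True
```

Note that the dtype shown for flatindex in Out[52] is platform dependent, so it may display differently on your machine.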

Some quick timing comparisons (for =data, not +=data):

In [63]: timeit target[index1,index2]=data
100000 loops, best of 3: 6.63 µs per loop

In [64]: timeit np.put(target,flatindex,data)
100000 loops, best of 3: 2.47 µs per loop

In [65]: timeit target.flat[flatindex]=data
100000 loops, best of 3: 2.77 µs per loop

In [66]: %%timeit
   ....: flatindex=np.ravel_multi_index((index1,index2),target.shape)
   ....: target.flat[flatindex]=data
   ....: 
100000 loops, best of 3: 7.34 µs per loop

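np.put and target.flat[]= are the winners, provided the raveled index is already available: in these timings np.put is slightly faster (2.47 µs versus 2.77 µs), and both beat 2D indexing (6.63 µs). If the ravel has to be computed each time (7.34 µs), the advantage disappears. The raveled index will already be available if the calculation is applied repeatedly with the same index arrays. Keep in mind that timings on small arrays may not scale the same way on large arrays.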
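
Since small-array timings may not carry over, it is worth rerunning the comparison at a size closer to your real data. Below is a rough sketch using the standard timeit module outside IPython. The array shape, the number of indices, the helper function names and the timings dict are all arbitrary choices of mine, not part of the original test, and the indices are drawn without repeats to match the case above.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
shape = (1000, 1000)          # assumed size, adjust to your data
n = 100_000                   # assumed number of indexed points

# Draw unique flat positions, then recover the matching 2D index arrays.
flatindex = rng.choice(shape[0] * shape[1], size=n, replace=False)
index1, index2 = np.unravel_index(flatindex, shape)
data = np.arange(1.0, n + 1)
target = np.zeros(shape)

def assign_2d():
    target[index1, index2] = data

def assign_put():
    np.put(target, flatindex, data)

def assign_flat():
    target.flat[flatindex] = data

def assign_flat_with_ravel():
    # Includes the cost of computing the flat index on every call.
    fi = np.ravel_multi_index((index1, index2), shape)
    target.flat[fi] = data

timings = {}
for fn in (assign_2d, assign_put, assign_flat, assign_flat_with_ravel):
    # Best of 3 repeats, 20 calls each, reported per call.
    best = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    timings[fn.__name__] = best
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per call")
```

The ranking you get will depend on the array size, the index pattern and the NumPy version, so treat the numbers in this answer as one data point only.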

With += instead, put does not work (it only assigns). flat keeps a speed advantage, even when the ravel has to be computed:

In [78]: timeit target[index1,index2]+=data
100000 loops, best of 3: 16.2 µs per loop

In [79]: timeit target.flat[flatindex]+=data
100000 loops, best of 3: 7.45 µs per loop

In [80]: %%timeit
   ....: flatindex=np.ravel_multi_index((index1,index2),target.shape)
   ....: target.flat[flatindex]+=data
   ....: 
100000 loops, best of 3: 13.4 µs per loop

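But if the indices contain repeats and you want to add all of the data values, the problem changes significantly. Direct indexing like this is buffered, so for a repeated point only the last addition takes effect.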
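
To make the buffering issue concrete, here is a small 1D sketch with repeated indices. It compares the buffered += with two unbuffered options that exist in NumPy, np.add.at and np.bincount with weights. These are alternatives I am suggesting and are not necessarily the ones discussed in the linked question. The arrays idx and vals are made up for the example.

```python
import numpy as np

idx = np.array([0, 0, 1, 1, 1])            # positions 0 and 1 are repeated
vals = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Buffered: evaluated as buffered[idx] = buffered[idx] + vals,
# so only one addition per repeated position survives.
buffered = np.zeros(3)
buffered[idx] += vals

# Unbuffered: every (index, value) pair is accumulated.
unbuffered = np.zeros(3)
np.add.at(unbuffered, idx, vals)

# Alternative for non-negative integer indices: a weighted count.
counted = np.bincount(idx, weights=vals, minlength=3)

print(buffered.tolist())      # repeated positions lose contributions
print(unbuffered.tolist())    # [3.0, 12.0, 0.0]
print(counted.tolist())       # [3.0, 12.0, 0.0]
```

For the 2D case, np.add.at(target, (index1, index2), data) takes the index arrays as a tuple. np.add.at is generally slower than plain indexing, so it is only worth using when repeats actually occur.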

For a discussion of the buffering issue and alternatives, see this recent SO question:

Vector operations with numpy
