Fastest way to merge multiple numpy arrays into a single array

数据工程师

Asked 2025-04-16 17:48

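If you know the length of a list and the size of the arrays in it (all the same), what is the fastest way to merge those numpy arrays into a single array?

I tried two approaches:

merged_array = array(list_of_arrays), taken from "Pythonic way to create a numpy array from a list of numpy arrays".
vstack

Judging by the results, vstack is faster, but strangely the first run takes about three times as long as the second. I assume this is caused by missing preallocation. So how would I preallocate an array for vstack? Or do you know a faster method?

Thanks!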

[Update]

What I want is (25600, 320), not (80, 320, 320), which means merged_array = array(list_of_arrays) does not work for me. Thanks to Joris for pointing that out!
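As an illustration of how the two shapes relate, here is a minimal sketch, assuming all arrays are 2-D with identical shapes. The helper name merge_rows and the small sizes are hypothetical and chosen only so the example runs instantly. It reshapes the 3-D result of numpy.array into the 2-D row-stacked layout:

```python
import numpy as np

def merge_rows(list_of_arrays):
    """Stack equally sized 2-D arrays into shape (n * height, width)."""
    stacked = np.array(list_of_arrays)               # shape (n, height, width)
    return stacked.reshape(-1, stacked.shape[-1])    # shape (n * height, width)

# Hypothetical small input: 5 matrices of shape (3, 4).
matrices = [np.full((3, 4), k, dtype=np.float32) for k in range(5)]
merged = merge_rows(matrices)
print(merged.shape)
```

This gives the same layout as a single numpy.vstack call on the list, but it still pays the cost of numpy.array, so it is shown for the shape relationship rather than for speed.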

Output:

0.547468900681 s merged_array = array(first_list_of_arrays)
0.547191858292 s merged_array = array(second_list_of_arrays)
0.656183958054 s vstack first
0.236850976944 s vstack second

Code:

import numpy
import time

width = 320
height = 320
n_matrices = 80

# Build two independent lists of 80 random 320x320 float32 matrices.
secondmatrices = list()
for i in range(n_matrices):
    temp = numpy.random.rand(height, width).astype(numpy.float32)
    secondmatrices.append(numpy.round(temp*9))

firstmatrices = list()
for i in range(n_matrices):
    temp = numpy.random.rand(height, width).astype(numpy.float32)
    firstmatrices.append(numpy.round(temp*9))


# Method 1: numpy.array on the whole list, giving shape (80, 320, 320).
t1 = time.time()
first1 = numpy.array(firstmatrices)
print(time.time() - t1, "s merged_array = array(first_list_of_arrays)")

t1 = time.time()
second1 = numpy.array(secondmatrices)
print(time.time() - t1, "s merged_array = array(second_list_of_arrays)")

# Method 2: pop one matrix at a time and vstack it on top of the result.
# Note that this empties the list and copies the growing result each time.
t1 = time.time()
first2 = firstmatrices.pop()
for i in range(len(firstmatrices)):
    first2 = numpy.vstack((firstmatrices.pop(), first2))
print(time.time() - t1, "s vstack first")

t1 = time.time()
second2 = secondmatrices.pop()
for i in range(len(secondmatrices)):
    second2 = numpy.vstack((secondmatrices.pop(), second2))

print(time.time() - t1, "s vstack second")

1 Answer

You have 80 arrays of 320x320? Then you probably want to use dstack:

first3 = numpy.dstack(firstmatrices)

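This returns a single 320x320x80 array. It holds the same data as numpy.array(firstmatrices), which returns 80x320x320, but with the stacking axis last, and it is much faster: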

timeit numpy.dstack(firstmatrices)
10 loops, best of 3: 47.1 ms per loop


timeit numpy.array(firstmatrices)
1 loops, best of 3: 750 ms per loop
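To make the axis order of the two results concrete, here is a small sketch with hypothetical non-square sizes (4 matrices of shape 2x3), so the axes can be told apart. It assumes nothing beyond numpy.array, numpy.dstack and numpy.moveaxis:

```python
import numpy as np

# Hypothetical small input: 4 matrices of shape (2, 3).
matrices = [np.full((2, 3), k, dtype=np.float32) for k in range(4)]

as_array = np.array(matrices)    # stacking axis first: (n, height, width)
as_dstack = np.dstack(matrices)  # stacking axis last:  (height, width, n)
print(as_array.shape, as_dstack.shape)

# Moving the first axis of the numpy.array result to the end
# gives the same values as numpy.dstack.
same_data = np.array_equal(as_dstack, np.moveaxis(as_array, 0, -1))
print(same_data)
```

So as_dstack[:, :, k] is the k-th input matrix, whereas as_array[k] is.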

If you want to use vstack, pass the whole list in a single call; it returns a 25600x320 array:

timeit numpy.vstack(firstmatrices)
100 loops, best of 3: 18.2 ms per loop
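On the preallocation question, one possible approach is to allocate the result once with numpy.empty and copy each matrix into its slice. This is only a sketch under the assumption that all arrays are 2-D with the same shape and dtype; the function name merge_preallocated is hypothetical, and whether it beats a single vstack call would have to be measured on your data:

```python
import numpy as np

def merge_preallocated(list_of_arrays):
    """Copy equally sized 2-D arrays into one preallocated 2-D array."""
    first = list_of_arrays[0]
    height, width = first.shape
    # Allocate the full result once instead of growing it step by step.
    out = np.empty((len(list_of_arrays) * height, width), dtype=first.dtype)
    for k, block in enumerate(list_of_arrays):
        out[k * height:(k + 1) * height] = block  # fill rows of block k
    return out

# Hypothetical small input: 5 matrices of shape (3, 4).
matrices = [np.full((3, 4), k, dtype=np.float32) for k in range(5)]
merged = merge_preallocated(matrices)
print(merged.shape)
```

Unlike the pop-and-vstack loop in the question, this copies each input exactly once and leaves the list intact.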

Answered 2025-04-16 by Python大师
