What is the fastest way to stack numpy arrays in a loop?

result_arr = np.array([])
for label in labels_set:
    data = [index for index, value in enumerate(labels_list) if value == label]
    for i in data:
        sub_corpus.append(corpus[i])
    data_sub_tfidf = vec.fit_transform(sub_corpus)
    data_transform = pca.fit_transform(data_sub_tfidf)
    # Append array
    sub_corpus = []

3 Answers

User

#1 · Edited 2024-04-26 21:48:42

Initialize "c" as an empty array with the right number of columns, then append each array to it along axis 0:

import numpy as np

a = np.array([[8, 3, 1], [2, 5, 1], [6, 5, 2]])
b = np.array([[2, 5, 1], [2, 5, 2]])
matrix = [a, b]

c = np.empty([0,matrix[0].shape[1]])

for v in matrix:
    c = np.append(c, v, axis=0)

Result: c is a (5, 3) float array holding the rows of a followed by the rows of b.
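As a side note on the loop above: np.append does not grow an array in place. It returns a new array with a copy of the data, which is why the result has to be assigned back to c on every iteration. A minimal sketch, where the names d and v are only for illustration:

```python
import numpy as np

c = np.empty((0, 3))
v = np.array([[8, 3, 1], [2, 5, 1]])

# np.append returns a new array; c itself is left unchanged
d = np.append(c, v, axis=0)

print(c.shape)                 # (0, 3)
print(d.shape)                 # (2, 3)
print(np.shares_memory(d, v))  # False: the rows were copied
```

Because every call copies everything accumulated so far, the cost of this pattern grows with the number of iterations.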

User

#2 · Edited 2024-04-26 21:48:42

What @hpaulj means by

Stick with list append when doing loops.

is:

# use a normal list
result_list = []

for label in labels_set:
    # ... build data_sub_tfidf for this label, as in the question ...
    data_transform = pca.fit_transform(data_sub_tfidf)

    # append the data_transform array to that list
    # Note: this is list.append(), not np.append(), which is slow here
    result_list.append(data_transform)

# and stack it once, after the loop
# This avoids slow memory allocation in the loop: the final size of the
# concatenated array is known, so only one large chunk of memory is allocated.
result_arr = np.concatenate(result_list)

# or, equivalently for 2-D arrays with the same number of columns
result_arr = np.vstack(result_list)

# or, only if every array has exactly the same shape (this adds a new leading axis)
result_arr = np.stack(result_list, axis=0)

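Your arrays don't really have different dimensions. They differ along one dimension and match along the other. In that case you can always stack along the dimension that differs.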
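The pattern from this answer can be tried out in isolation. The sketch below is only an illustration: the random arrays and the row counts are hypothetical stand-ins for the per-label PCA outputs, chosen so that the row count differs and the column count matches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the per-label results:
# different numbers of rows, same number of columns.
row_counts = [40, 175, 12]

chunks = []
for n_rows in row_counts:
    chunks.append(rng.random((n_rows, 2)))  # plain list.append, no array copy

# One concatenation after the loop, along the axis that differs
stacked = np.concatenate(chunks, axis=0)
print(stacked.shape)  # (227, 2)

# np.stack needs identical shapes, so it fails for these inputs
stack_failed = False
try:
    np.stack(chunks, axis=0)
except ValueError:
    stack_failed = True
print(stack_failed)  # True
```

For inputs like these, np.concatenate (or np.vstack) is the call that fits; np.stack only works when every array in the list has the same shape.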

User

#3 · Edited 2024-04-26 21:48:42

If you have an array a of shape (40, 2) and an array b of shape (175, 2), you can simply use np.concatenate([a, b]) to get a final array of shape (215, 2).
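A quick check of those shapes, using zero- and one-filled placeholder arrays in place of real data:

```python
import numpy as np

a = np.zeros((40, 2))   # placeholder data with shape (40, 2)
b = np.ones((175, 2))   # placeholder data with shape (175, 2)

# axis=0 is the default, so the rows of b follow the rows of a
result = np.concatenate([a, b])
print(result.shape)  # (215, 2)
```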
