Why is my GPU slower than my CPU at matrix operations?

import numpy as np
import cupy as cp
import time

### Numpy and CPU
s = time.time()
A = np.random.random([10000,10000])
B = np.random.random([10000,10000])
CPU = np.matmul(A,B)
CPU *= 5
e = time.time()
print(f'CPU time: {e - s: .2f}')

### CuPy and GPU
s = time.time()
C = cp.random.random([10000,10000])
D = cp.random.random([10000,10000])
GPU = cp.matmul(C,D)
GPU *= 5
cp.cuda.Stream.null.synchronize()  # to let the code finish executing on the GPU before calculating the time
e = time.time()
print(f'GPU time: {e - s: .2f}')

1 Answer

User

#1 · Posted 2024-04-27 04:00:54

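np.random.random returns float64 (double precision), and cp.random.random also defaults to float64, so the benchmark in the question runs the GPU in double precision. Many consumer GPUs have far lower double-precision throughput than single-precision throughput, which can leave the GPU behind the CPU. Switch the GPU random number generation to float32 as follows (for a strictly like-for-like comparison, also convert the NumPy arrays with .astype(np.float32)):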

C = cp.random.random([10000,10000], dtype=cp.float32)
D = cp.random.random([10000,10000], dtype=cp.float32)

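My hardware (CPU and GPU) is different from yours, but once this change is made the GPU version is about 12 times faster than the CPU version. Generating the random arrays, the matrix multiplication, and the scalar multiplication with CuPy take less than 1 second in total.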
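
To keep the comparison like-for-like regardless of each library's default, one option is to set the dtype explicitly on both sides and to time only the multiplication after a warm-up run. The following is a minimal sketch, not a definitive benchmark: the helper names time_matmul and load_cupy, the matrix size of 1000, and the repeat count are assumptions made for illustration, and the GPU branch only runs when CuPy and a CUDA device are usable.

```python
import time
import numpy as np


def time_matmul(xp, n, dtype, sync=None, repeats=3):
    """Best wall-clock time (seconds) for an n x n matmul plus scalar multiply."""
    # Set the dtype explicitly so both libraries do the same work.
    a = xp.random.random((n, n)).astype(dtype)
    b = xp.random.random((n, n)).astype(dtype)
    xp.matmul(a, b)  # warm-up run, excluded from the timing
    if sync is not None:
        sync()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        c = xp.matmul(a, b)
        c *= 5
        if sync is not None:
            sync()  # wait for the GPU to finish before reading the clock
        best = min(best, time.perf_counter() - start)
    return best


def load_cupy():
    """Return the cupy module if it and a CUDA device are usable, else None."""
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        pass
    return None


n = 1000  # assumption: kept small for a quick run; the question uses 10000
cp = load_cupy()
for dtype in (np.float32, np.float64):
    name = np.dtype(dtype).name
    print(f"CPU {name}: {time_matmul(np, n, dtype):.4f} s")
    if cp is not None:
        gpu = time_matmul(cp, n, dtype, sync=cp.cuda.Stream.null.synchronize)
        print(f"GPU {name}: {gpu:.4f} s")
```

Running both dtypes side by side shows how much of the gap comes from precision alone, and the warm-up keeps one-time startup costs out of the measured time.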
