I am trying to allocate some arrays inside a for loop in a kernel. The kernel looks like this:
@cuda.jit
def forcecudatiling(p_num,d_num,r,force):
    threadsInBlock=cuda.blockDim.x
    threadsInGrid=threadsInBlock*cuda.gridDim.x
    tid=cuda.threadIdx.x + cuda.blockIdx.x*cuda.blockDim.x
    tiles=p_num/cuda.blockDim.x + 1
    shared_p_mx = cuda.shared.array(0,dtype=np.float32)
    shared_p_my = cuda.shared.array(0,dtype=np.float32)
    alpha=(1.5)
    rho=(1.0)
    beta=(1.5*(1.0+alpha))
    for k in range(tid,p_num,threadsInGrid):
        r_k=cuda.device_array((d_num,p_num))
        forcetemp=cuda.device_array((d_num,p_num))
        r_k[0,k]=r[0,k]
        r_k[1,k]=r[1,k]
        forcetemp[0,k]=0.0
        forcetemp[1,k]=0.0
The arrays I am trying to allocate are r_k and forcetemp, but with the code above I get the following error:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Unknown attribute 'device_array' of type Module()

File "", line 117:
def forcecudatiling(p_num,d_num,r,force):
    for k in range(tid,p_num,threadsInGrid):
        r_k=cuda.device_array((d_num,p_num))
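You can't do that. There is no memory allocation or array creation of this kind inside Numba CUDA kernels: cuda.device_array is a host-side function, which is why the typing step reports it as an unknown attribute. Allocate the arrays on the host before the launch and pass them to the kernel as arguments.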