Numba guvectorize decorator raises UNKNOWN_CUDA_ERROR on the call to cuMemcpyDtoH

0 votes
1 answer
2850 views
Asked 2025-06-18 04:01

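I'm trying to get a bilateral filter I wrote in Python to run on my GPU, but I keep running into errors, and one of them has me confused. When I run the code, I get: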

Call to cuMemcpyDtoH results in UNKNOWN_CUDA_ERROR

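Judging from other posts, this seems to be memory-related? But I'm not writing CUDA code or managing memory myself (I only added some decorators so the function would run on the GPU), so I'm not sure how to fix it. Did I convert the code to run on the GPU the wrong way?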

import numpy as np
import cv2
import sys
import math
import cmath
import tqdm
from numba import jit, cuda, vectorize, guvectorize, float64, int64

sIntesity = 12.0
sSpace = 16.0
diameter = 100

@guvectorize([(float64[:,:], float64[:,:])],  '(n,m)->(n,m)',target='cuda',nopython =True)
def apply_filter(img, filteredImage):

    #imh, imw = img.shape[:2]
    imh = 600
    imw = 600
    hd = int((diameter - 1) / 2)

    for h in range(hd, imh - hd):
        for w in range(hd, imw - hd):
            Wp = 0
            filteredPixel = 0
            radius = diameter // 2
            for x in range(0, diameter):
                for y in range(0, diameter):

                    currentX = w - (radius - x)
                    cureentY = h - (radius - y)

                    intensityDifferent = img[currentX][cureentY] - img[w][h]
                    intensity = (1.0/ (2 * math.pi * (sIntesity ** 2))* math.exp(-(intensityDifferent ** 2) / (2 * sIntesity ** 2)))
                    foo = (currentX - w) ** 2 + (cureentY - h) ** 2
                    distance = cmath.sqrt(foo)
                    smoothing = (1.0 / (2 * math.pi * (sSpace ** 2))) * math.exp( -(distance.real ** 2) / (2 * sSpace ** 2))
                    weight = intensity * smoothing
                    filteredPixel += img[currentX][cureentY] * weight
                    Wp += weight

            filteredImage[h][w] = int(round(filteredPixel / Wp))


if __name__ == "__main__":
    src = cv2.imread("messy2.png", cv2.IMREAD_GRAYSCALE)
    src = src.astype(float)
    filtered_image_own = np.zeros(src.shape)
    print(type(src),type(filtered_image_own))
    apply_filter(src, filtered_image_own)
    filtered_image_own = filtered_image_own.astype(np.uint8) 
    cv2.imwrite("filtered_image4.png", filtered_image_own)


1 Answer

2

After switching the code from CUDA to the CPU, I found a bug in it: it was trying to access an invalid index. That is what the error was actually telling me.
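The answer doesn't say which index was invalid, so the following is only a sketch of how one might look for it without a GPU. It replays the index arithmetic from the question's loops in plain Python along one axis. The helper name `first_invalid_index` and its parameters are made up for illustration; `assumed_size` stands in for the hard-coded `imh`/`imw` of 600, and `actual_size` for the real image dimension.

```python
def first_invalid_index(assumed_size, actual_size, diameter):
    """Return the first neighbour index outside [0, actual_size), or None.

    Hypothetical helper: mirrors `currentX = w - (radius - x)` from the
    question, with `hd` and `radius` computed the same way as there.
    """
    hd = int((diameter - 1) / 2)   # loop margin used in the question
    radius = diameter // 2         # offset used for the neighbour index
    for center in range(hd, assumed_size - hd):
        for x in range(diameter):
            idx = center - (radius - x)
            if idx < 0 or idx >= actual_size:
                return idx
    return None


if __name__ == "__main__":
    # Values from the question: 600 assumed, diameter = 100.
    print(first_invalid_index(600, 600, 100))  # -1
    # An odd diameter keeps the margin and the radius equal.
    print(first_invalid_index(600, 600, 99))   # None
    # Assumed case: the real image is smaller than the hard-coded 600.
    print(first_invalid_index(600, 512, 99))   # 512
```

With the question's values, `hd` is 49 but `radius` is 50, so the very first neighbour lands on index -1. Separately, if the image is not really 600 pixels along an axis, the hard-coded bounds run past the end. Note also that the question indexes `img[currentX][cureentY]` with the column-derived value first, which matters for non-square images. Whether any of these is the bug the answer refers to is an assumption.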
