Type cast error: '__Pyx_memviewslice' to 'double *' in Cython, what's the equivalent? MKL function, prange code

Posted 2024-05-28 20:34:31


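I have written a Cython program that calls Intel MKL for matrix multiplication, with the aim of making it parallel. It is based on an old SO post about linking to BLAS and uses a bunch of Cython methods I had never seen before. I got it working, but it ran much slower than NumPy (which is also linked to MKL). To speed it up, I switched to the typical memoryview format (it had been using the ndarray np.float64_t data type for a couple of operations). But now it no longer works with double[::1] memoryviews. Here is the error generated:

'type cast': cannot convert from '__Pyx_memviewslice' to 'double *'

Because the type cast does not work, the MKL function only sees 3 of its 5 arguments:

error C2660: 'cblas_ddot': function does not take 3 arguments

Here is the .PYX code: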

import numpy as np
cimport numpy as np
cimport cython
from cython cimport view
from cython.parallel cimport prange     #this is your OpenMP portion
from openmp cimport omp_get_max_threads #only used for getting the max # of threads on the machine 

cdef extern from "mkl_cblas.h" nogil: #import a function from Intel's MKL library
    double ddot "cblas_ddot"(int N,
                             double *X, 
                             int incX,
                             double *Y, 
                             int incY)

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
cpdef matmult(double[:,::1] A, double[:,::1] B):
    cdef int Ashape0=A.shape[0], Ashape1=A.shape[1], Bshape0=B.shape[0], Bshape1=B.shape[1], Arowshape0=A[0,:].shape[0] #these are defined here as they aren't allowed in a prange loop

    if Ashape1 != Bshape1:
        raise TypeError('Inner dimensions are not consistent!')

    cdef int i, j
    cdef double[:,::1] out = np.zeros((Ashape0, Bshape1))
    cdef double[::1] A_row = np.zeros(Ashape0)
    cdef double[:] B_col = np.zeros(Bshape1) #no idea why this is not allowed to be [::1]
    cdef int Arowstrides = A_row.strides[0] // sizeof(double)
    cdef int Bcolstrides = B_col.strides[0] // sizeof(double)
    cdef int maxthreads = omp_get_max_threads()

    for i in prange(Ashape0, nogil=True, num_threads=maxthreads, schedule='static'): # to use all cores

        A_row = A[i,:]
        for j in range(Bshape1):
            B_col = B[:,j]
            out[i,j] = ddot(Arowshape0, #call the imported Intel MKL library
                            <double*>A_row,
                            Arowstrides, 
                            <double*>B_col,
                            Bcolstrides) 

    return np.asarray(out)

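I'm sure this is easy enough for someone to point out. If you see anything that could be improved, please advise: this was hacked together, and I don't think the i/j loops are even needed. As for the cleanest example around, https://gist.github.com/JonathanRaiman/f2ce5331750da7b2d4e9, I finally got it to compile and it is actually much faster (2x), but it gives no results, so I will put that in another post (here: Calling BLAS / LAPACK directly using the SciPy interface and Cython - also how to add MKL).

Many thanks.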


1 answer

User

#1 · Posted 2024-05-28 20:34:31

To get a pointer from a memoryview, take the address of its first element:

ddot(Arowshape0,  # call the imported Intel MKL library
     &A_row[0],
     Arowstrides,
     &B_col[0],
     Bcolstrides)
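
To see why a pointer to the first element plus an increment is all cblas_ddot needs, here is a minimal pure-Python sketch. The helper strided_dot is a hypothetical stand-in written for illustration (it is not MKL's implementation), and the flat buffers are assumed to be row-major, like the double[:, ::1] memoryviews in the question. The offsets play the role of the &A_row[0] and &B_col[0] pointers.

```python
from array import array


def strided_dot(n, x, x_off, inc_x, y, y_off, inc_y):
    """Hypothetical stand-in for cblas_ddot(N, X, incX, Y, incY).

    x_off / y_off mimic the pointer to the first element;
    inc_x / inc_y are the element strides between consecutive values.
    """
    total = 0.0
    for k in range(n):
        total += x[x_off + k * inc_x] * y[y_off + k * inc_y]
    return total


# A is 2x3 and B is 3x2, both stored row-major in flat buffers (assumed layout).
A = array("d", [1, 2, 3,
                4, 5, 6])
B = array("d", [7, 8,
                9, 10,
                11, 12])
a_rows, a_cols, b_cols = 2, 3, 2

# Row i of A starts at offset i * a_cols and advances by 1 element.
# Column j of B starts at offset j and advances by b_cols elements.
out = [
    [strided_dot(a_cols, A, i * a_cols, 1, B, j, b_cols) for j in range(b_cols)]
    for i in range(a_rows)
]
```

A row of a C-contiguous matrix has an increment of 1, while a column has an increment equal to the number of columns. That would explain why B_col in the question cannot be declared as double[::1]: a column slice is not contiguous.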
