Why doesn't Cython speed up my code?


I need to speed this code up so that it runs in 4 ms:

<pre><code>import numpy as np


def return_call(data):
    num = int(data.shape[0] / 4096)
    buff_spectrum = np.empty(2048, dtype=np.uint64)
    buff_detect = np.empty(2048, dtype=np.uint64)
    end_spetrum = np.empty(num*1024, dtype=np.uint64)
    end_detect = np.empty(num*1024, dtype=np.uint64)
    _data = np.reshape(data, (num, 4096))
    for _raw_data_spec in _data:
        raw_data_spec = np.reshape(_raw_data_spec, (2048, 2))
        for i in range(2048):
            buff_spectrum[i] = (np.int16(raw_data_spec[i][0])<<17)|(np.int16(raw_data_spec[i][1] <<1))>>1
            buff_detect[i] = (np.int16(raw_data_spec[i][0])>>15)
        for i in range(511, -1, -1):
            if buff_spectrum[i+1024] != 0:
                end_spetrum[i] = (np.log10(buff_spectrum[i+1024]))
                end_detect[i] = buff_detect[i+1024]
            else:
                end_spetrum[i] = 0
                end_detect[i] = 0
        for i in range(1023, 511, -1):
            if buff_spectrum[i+1024] != 0:
                end_spetrum[i] = (np.log10(buff_spectrum[i + 1024]))
                end_detect[i] = buff_detect[i + 1024]
            else:
                end_spetrum[i] = 0
                end_detect[i] = 0
    return end_spetrum, end_detect
</code></pre>

I decided to use Cython for this, but I did not get any speedup:

<pre><code>import numpy as np
cimport numpy

ctypedef signed short DTYPE_t


cpdef return_call(numpy.ndarray[DTYPE_t, ndim=1] data):
    cdef int i
    cdef int num = data.shape[0]/4096
    cdef numpy.ndarray _data
    cdef numpy.ndarray[unsigned long long, ndim=1] buff_spectrum = np.empty(2048, dtype=np.uint64)
    cdef numpy.ndarray[unsigned long long, ndim=1] buff_detect = np.empty(2048, dtype=np.uint64)
    cdef numpy.ndarray[double, ndim=1] end_spetrum = np.empty(num*1024, dtype=np.double)
    cdef numpy.ndarray[double, ndim=1] end_detect = np.empty(num*1024, dtype=np.double)
    _data = np.reshape(data, (num, 4096))
    for _raw_data_spec in _data:
        raw_data_spec = np.reshape(_raw_data_spec, (2048, 2))
        for i in range(2048):
            buff_spectrum[i] = (np.uint16(raw_data_spec[i][0])<<17)|(np.uint16(raw_data_spec[i][1] <<1))>>1
            buff_detect[i] = (np.uint16(raw_data_spec[i][0])>>15)
        for i in range(511, -1, -1):
            if buff_spectrum[i+1024] != 0:
                end_spetrum[i] = (np.log10(buff_spectrum[i+1024]))
                end_detect[i] = buff_detect[i+1024]
            else:
                end_spetrum[i] = 0
                end_detect[i] = 0
        for i in range(1023, 511, -1):
            if buff_spectrum[i+1024] != 0:
                end_spetrum[i] = (np.log10(buff_spectrum[i + 1024]))
                end_detect[i] = buff_detect[i + 1024]
            else:
                end_spetrum[i] = 0
                end_detect[i] = 0
    return end_spetrum, end_detect
</code></pre>

The best time I have achieved is 80 ms, but I need it to be faster, because the data coming from the hardware has to be processed almost in real time. Can you tell me why there is no speedup, and whether the target is realistic? I have also attached the code of the test file:

<pre><code>import numpy as np
import example_original
import example_cython

data = np.empty(8192*2, dtype=np.int16)

import time
startpy = time.time()
example_original.return_call(data)
finpy = time.time() - startpy
startcy = time.time()
k, r = example_cython.return_call(data)
fincy = time.time() - startcy
print(fincy, finpy)
print('Cython is {}x faster'.format(finpy/fincy))
</code></pre>
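A likely reason for the missing speedup is that `_data` and `raw_data_spec` are untyped in the Cython version, so every `raw_data_spec[i][0]`, `np.uint16(...)` and `np.log10(...)` inside the loops is still an ordinary Python call on a Python object. Before tuning Cython further, it may be worth trying plain vectorized NumPy. The following is only a minimal sketch and rests on several assumptions that the post does not confirm: `data` is a contiguous one-dimensional `int16` array; bit 15 of the first word of each pair is the detect flag and the remaining 31 bits of the pair form the magnitude; and each 4096-sample chunk should fill its own 1024 output slots (the posted loops rewrite slots 0..1023 for every chunk). The name `return_call_vectorized` is hypothetical.

```python
import numpy as np


def return_call_vectorized(data):
    """Vectorized sketch; the bit layout is an assumption, not confirmed by the post."""
    num = data.shape[0] // 4096
    # Reinterpret the int16 samples as unsigned words: (chunk, pair index, word index).
    words = data[:num * 4096].view(np.uint16).reshape(num, 2048, 2)
    # The posted loops only read pairs 1024..2047 of each chunk.
    hi = words[:, 1024:, 0].astype(np.uint64)
    lo = words[:, 1024:, 1].astype(np.uint64)
    # Assumed layout: bit 15 of the first word is the detect flag ...
    detect = (hi >> np.uint64(15)).astype(np.float64)
    # ... and the low 15 bits of the first word plus the second word are the magnitude.
    spectrum = ((hi & np.uint64(0x7FFF)) << np.uint64(16)) | lo
    # log10 where the magnitude is nonzero, 0 elsewhere (mirrors the if/else branches).
    nonzero = spectrum != 0
    end_spectrum = np.zeros(spectrum.shape, dtype=np.float64)
    end_spectrum[nonzero] = np.log10(spectrum[nonzero].astype(np.float64))
    end_detect = np.where(nonzero, detect, 0.0)
    return end_spectrum.ravel(), end_detect.ravel()
```

Whether this reaches 4 ms depends on the machine, so it should be measured on the target hardware, ideally with `time.perf_counter()` or `timeit` over many repetitions rather than a single `time.time()` difference.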


1 answer

Anonymous, 1 day ago

