<p>I am optimizing some code in a Python module. I have pinned down the bottleneck: a snippet that performs some calculations with <code>numpy</code>, namely the following:</p>
<pre><code> xh = np.multiply(K_Rinv[0, 0], x )
xh += np.multiply(K_Rinv[0, 1], y)
xh += np.multiply(K_Rinv[0, 2], h)
yh = np.multiply(K_Rinv[1, 0], x)
yh += np.multiply(K_Rinv[1, 1], y)
yh += np.multiply(K_Rinv[1, 2], h)
q = np.multiply(K_Rinv[2, 0], x)
q += np.multiply(K_Rinv[2, 1], y)
q += np.multiply(K_Rinv[2, 2], h)
</code></pre>
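<p>Here <code>x</code>, <code>y</code> and <code>h</code> are <code>np.ndarray</code>s with shape (4206, 5749), and <code>K_Rinv</code> is an <code>np.ndarray</code> with shape (3, 3).
This snippet is called many times and takes more than 50% of the total run time.
Is there a way to speed it up? Or is it simply as fast as it gets?</p>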
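<p>For readers following along, the nine scalar-times-array products above amount to applying the 3x3 matrix <code>K_Rinv</code> to the vector (x, y, h) at every array position. The sketch below only illustrates that equivalence on small stand-in arrays; the names <code>stacked</code> and <code>out</code> are hypothetical, and it makes no claim about whether this form is faster.</p>

```python
import numpy as np

# Small stand-in arrays; the arrays in the question have shape (4206, 5749).
rng = np.random.default_rng(0)
x = rng.random((4, 5))
y = rng.random((4, 5))
h = rng.random((4, 5))
K_Rinv = rng.random((3, 3))

# Reference: the same sums as in the question, written with operators.
xh = K_Rinv[0, 0] * x + K_Rinv[0, 1] * y + K_Rinv[0, 2] * h
yh = K_Rinv[1, 0] * x + K_Rinv[1, 1] * y + K_Rinv[1, 2] * h
q = K_Rinv[2, 0] * x + K_Rinv[2, 1] * y + K_Rinv[2, 2] * h

# Same computation as one contraction over the stacked inputs.
stacked = np.stack((x, y, h))                      # shape (3, 4, 5)
out = np.einsum('ij,jkl->ikl', K_Rinv, stacked)    # shape (3, 4, 5)
xh_m, yh_m, q_m = out                              # unpack along the first axis
```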
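<p><strong>Edit1:</strong><br/>
Thanks for the answers. After running into problems with numba (see the error message at the end), I tried the suggestion to use numexpr. However, my code breaks when I use this solution, so I checked whether the results are the same, and they are not. Here is the code I am using:</p>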
<pre><code> xh_1 = numexpr.evaluate('a1*b1+a2*b2+a3*b3', {'a1': K_Rinv[0, 0], 'b1': x,
'a2': K_Rinv[0, 1], 'b2': y,
'a3': K_Rinv[0, 2], 'b3': h})
yh_1 = numexpr.evaluate('a1*b1+a2*b2+a3*b3', {'a1': K_Rinv[1, 0], 'b1': x,
'a2': K_Rinv[1, 1], 'b2': y,
'a3': K_Rinv[1, 2], 'b3': h})
q_1 = numexpr.evaluate('a1*b1+a2*b2+a3*b3', {'a1': K_Rinv[2, 0], 'b1': x,
'a2': K_Rinv[2, 1], 'b2': y,
'a3': K_Rinv[2, 2], 'b3': h})
xh_2 = np.multiply(K_Rinv[0, 0], x )
xh_2 += np.multiply(K_Rinv[0, 1], y)
xh_2 += np.multiply(K_Rinv[0, 2], h)
yh_2 = np.multiply(K_Rinv[1, 0], x)
yh_2 += np.multiply(K_Rinv[1, 1], y)
yh_2 += np.multiply(K_Rinv[1, 2], h)
q_2 = np.multiply(K_Rinv[2, 0], x)
q_2 += np.multiply(K_Rinv[2, 1], y)
q_2 += np.multiply(K_Rinv[2, 2], h)
check1 = xh_1.all() == xh_2.all()
check2 = yh_1.all() == yh_2.all()
check3 = q_2.all() == q_1.all()
print ( " Check1 :{} , Check2: {} , Check3:{}" .format (check1,check2,check3))
</code></pre>
<p>I have no experience with numexpr. Is it normal for the results to differ?</p>
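<p>A side note on the check itself: <code>a.all() == b.all()</code> compares two single booleans ("are all elements non-zero?"), not the arrays element by element. A minimal sketch of the difference, using small made-up arrays <code>a</code> and <code>b</code> rather than the data from the question:</p>

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])   # clearly different from a

# a.all() and b.all() each collapse to one bool, so this compares
# True with True and says nothing about element-wise equality.
truthiness_check = a.all() == b.all()

# Element-wise comparisons:
exact_equal = np.array_equal(a, b)   # same shape and exactly equal values
close_equal = np.allclose(a, b)      # equal within a floating-point tolerance
```

<p>Here <code>truthiness_check</code> is true even though the arrays differ, while <code>np.array_equal</code> and <code>np.allclose</code> both report the difference.</p>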
<p>The error from numba:</p>
<pre><code> File "/usr/local/lib/python3.6/dist-packages/numba/dispatcher.py", line 420, in _compile_for_args
raise e
File "/usr/local/lib/python3.6/dist-packages/numba/dispatcher.py", line 353, in _compile_for_args
return self.compile(tuple(argtypes))
File "/usr/local/lib/python3.6/dist-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/numba/dispatcher.py", line 768, in compile
cres = self._compiler.compile(args, return_type)
File "/usr/local/lib/python3.6/dist-packages/numba/dispatcher.py", line 77, in compile
status, retval = self._compile_cached(args, return_type)
File "/usr/local/lib/python3.6/dist-packages/numba/dispatcher.py", line 91, in _compile_cached
retval = self._compile_core(args, return_type)
File "/usr/local/lib/python3.6/dist-packages/numba/dispatcher.py", line 109, in _compile_core
pipeline_class=self.pipeline_class)
File "/usr/local/lib/python3.6/dist-packages/numba/compiler.py", line 551, in compile_extra
return pipeline.compile_extra(func)
File "/usr/local/lib/python3.6/dist-packages/numba/compiler.py", line 327, in compile_extra
raise e
File "/usr/local/lib/python3.6/dist-packages/numba/compiler.py", line 321, in compile_extra
ExtractByteCode().run_pass(self.state)
File "/usr/local/lib/python3.6/dist-packages/numba/untyped_passes.py", line 67, in run_pass
bc = bytecode.ByteCode(func_id)
File "/usr/local/lib/python3.6/dist-packages/numba/bytecode.py", line 215, in __init__
self._compute_lineno(table, code)
File "/usr/local/lib/python3.6/dist-packages/numba/bytecode.py", line 237, in _compute_lineno
known = table[_FIXED_OFFSET].lineno
KeyError: 2
</code></pre>
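<p><strong>Edit2</strong>
Comments and answers are welcome.
After rereading the code carefully, I got the numexpr solution to work. Thank you very much. I still ran some tests in a separate Python file to see whether the numba code works there, but it is very slow. See the code I used below:</p>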
<pre><code>import numpy as np
import numba as nb
import numexpr
from datetime import datetime


def calc(x, y, h, K_Rinv):
    xh_2 = np.multiply(K_Rinv[0, 0], x)
    xh_2 += np.multiply(K_Rinv[0, 1], y)
    xh_2 += np.multiply(K_Rinv[0, 2], h)
    yh_2 = np.multiply(K_Rinv[1, 0], x)
    yh_2 += np.multiply(K_Rinv[1, 1], y)
    yh_2 += np.multiply(K_Rinv[1, 2], h)
    q_2 = np.multiply(K_Rinv[2, 0], x)
    q_2 += np.multiply(K_Rinv[2, 1], y)
    q_2 += np.multiply(K_Rinv[2, 2], h)
    return xh_2, yh_2, q_2


def calc_numexpr(x, y, h, K_Rinv):
    xh = numexpr.evaluate('a1*b1+a2*b2+a3*b3', {'a1': K_Rinv[0, 0], 'b1': x,
                                                'a2': K_Rinv[0, 1], 'b2': y,
                                                'a3': K_Rinv[0, 2], 'b3': h})
    yh = numexpr.evaluate('a1*b1+a2*b2+a3*b3', {'a1': K_Rinv[1, 0], 'b1': x,
                                                'a2': K_Rinv[1, 1], 'b2': y,
                                                'a3': K_Rinv[1, 2], 'b3': h})
    q = numexpr.evaluate('a1*b1+a2*b2+a3*b3', {'a1': K_Rinv[2, 0], 'b1': x,
                                               'a2': K_Rinv[2, 1], 'b2': y,
                                               'a3': K_Rinv[2, 2], 'b3': h})
    return xh, yh, q


@nb.njit(fastmath=True, parallel=True)
def calc_nb(x, y, h, K_Rinv):
    xh = np.empty_like(x)
    yh = np.empty_like(x)
    q = np.empty_like(x)
    for i in nb.prange(x.shape[0]):
        for j in range(x.shape[1]):
            xh[i, j] = K_Rinv[0, 0] * x[i, j] + K_Rinv[0, 1] * y[i, j] + K_Rinv[0, 2] * h[i, j]
            yh[i, j] = K_Rinv[1, 0] * x[i, j] + K_Rinv[1, 1] * y[i, j] + K_Rinv[1, 2] * h[i, j]
            q[i, j] = K_Rinv[2, 0] * x[i, j] + K_Rinv[2, 1] * y[i, j] + K_Rinv[2, 2] * h[i, j]
    return xh, yh, q


x = np.random.random((4206, 5749))
y = np.random.random((4206, 5749))
h = np.random.random((4206, 5749))
K_Rinv = np.random.random((3, 3))

start = datetime.now()
x_calc, y_calc, q_calc = calc(x, y, h, K_Rinv)
end = datetime.now()
print("Calc took: {} ".format(end - start))

start = datetime.now()
x_numexpr, y_numexpr, q_numexpr = calc_numexpr(x, y, h, K_Rinv)
end = datetime.now()
print("Calc_numexpr took: {} ".format(end - start))

start = datetime.now()
x_nb, y_nb, q_nb = calc_nb(x, y, h, K_Rinv)
end = datetime.now()
print("Calc nb took: {} ".format(end - start))

check_nb_q = (q_calc == q_nb).all()
check_nb_y = (y_calc == y_nb).all()
check_nb_x = (x_calc == x_nb).all()
check_numexpr_q = (q_calc == q_numexpr).all()
check_numexpr_y = (y_calc == y_numexpr).all()
check_numexpr_x = (x_calc == x_numexpr).all()
print("Checks for numexpr: {} , {} ,{} \nChecks for nb: {} ,{}, {}".format(check_numexpr_x, check_numexpr_y, check_numexpr_q, check_nb_x, check_nb_y, check_nb_q))
</code></pre>
<p>The results are as follows:</p>
<pre><code>Calc took: 0:00:01.944150
Calc_numexpr took: 0:00:00.616224
Calc nb took: 0:00:01.553058
Checks for numexpr: True , True ,True
Checks for nb: False ,False, False
</code></pre>
<p>So the numba version does not work as expected. Any idea what I am doing wrong? I would really like to get the numba solution working as well.</p>
<p>Note: the version is "0.47.0".</p>
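<p>Two things may be worth ruling out in a benchmark like the one above. They are offered as assumptions, not as a confirmed diagnosis. First, the first call to an <code>@nb.njit</code> function includes compilation, so timing a single call measures more than the loop itself. Second, <code>fastmath=True</code> permits reordering of floating-point operations, so an exact <code>==</code> comparison against the NumPy result can fail even when the values agree up to rounding. The sketch below uses plain NumPy stand-ins so that it runs without numba; <code>time_call</code>, <code>calc_a</code> and <code>calc_b</code> are hypothetical names.</p>

```python
import time
import numpy as np


def time_call(fn, *args, warmup=1, repeats=3):
    """Return the best wall-clock time of fn(*args) after warm-up calls."""
    for _ in range(warmup):
        fn(*args)  # for a JIT-compiled function this absorbs compilation
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def calc_a(x, y, h, k):
    # Evaluation order: (k0*x + k1*y) + k2*h
    return (k[0] * x + k[1] * y) + k[2] * h


def calc_b(x, y, h, k):
    # Same math, different association: k0*x + (k1*y + k2*h)
    return k[0] * x + (k[1] * y + k[2] * h)


rng = np.random.default_rng(0)
x, y, h = (rng.random((200, 300)) for _ in range(3))
k = rng.random(3)

ra, rb = calc_a(x, y, h, k), calc_b(x, y, h, k)
bitwise_same = bool((ra == rb).all())    # may be False: rounding depends on order
within_tol = bool(np.allclose(ra, rb))   # comparison with a tolerance
elapsed = time_call(calc_a, x, y, h, k)  # seconds, excluding the warm-up call
```

<p>If the same pattern is applied to <code>calc_nb</code>, the warm-up call keeps compilation out of the measurement, and <code>np.allclose</code> replaces the exact equality checks.</p>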