<p>You are misinterpreting the <a href="https://en.wikipedia.org/wiki/Corner_detection#The_Harris_&_Stephens_/_Shi%E2%80%93Tomasi_corner_detection_algorithms" rel="nofollow noreferrer">Shi-Tomasi method</a>. You are computing the two derivatives <code>dx</code> and <code>dy</code>, locally averaging them (the sum differs from a local average only by a constant factor, which we can ignore), and then taking the minimum. The Shi-Tomasi equation refers to the <a href="https://en.wikipedia.org/wiki/Structure_tensor" rel="nofollow noreferrer">Structure Tensor</a>; it uses the smallest of the two eigenvalues of that matrix.</p>
<p>The structure tensor is the matrix formed by the outer product of the gradient with itself, then smoothed:</p>
<pre><code>[ smooth(dx*dx) smooth(dx*dy) ]
[ smooth(dx*dy) smooth(dy*dy) ]
</code></pre>
<p>That is, we take the x-derivative <code>dx</code> and the y-derivative <code>dy</code>, form the three images <code>dx*dx</code>, <code>dx*dy</code> and <code>dy*dy</code>, and smooth these three images. For each pixel we then have three values that together form a symmetric 2x2 matrix. This is called the structure tensor.</p>
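<p>As a minimal sketch of that step (assuming <code>dx</code> and <code>dy</code> are floating-point derivative images, picking sigma=2 purely as an example value, and using the names <code>Jxx</code>, <code>Jxy</code>, <code>Jyy</code> just for illustration), building and smoothing the three images could look like this:</p>
<pre class="lang-py prettyprint-override"><code>import cv2

# dx, dy: float32 derivative images, e.g. from cv2.Sobel with cv2.CV_32F.
# With ksize=(0, 0), OpenCV derives the kernel size from the sigma.
Jxx = cv2.GaussianBlur(dx * dx, (0, 0), sigmaX=2)
Jxy = cv2.GaussianBlur(dx * dy, (0, 0), sigmaX=2)
Jyy = cv2.GaussianBlur(dy * dy, (0, 0), sigmaX=2)
</code></pre>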
<p>The eigenvalues of this structure tensor tell us something about the local edges. If both are small, there are no edges in the neighborhood. If one is large, there is a single edge orientation in the local neighborhood. If both are large, something more complex is going on, most likely a corner. The larger the smoothing window, the larger the local neighborhood we are examining. It is important to pick a neighborhood size that matches the size of the structures we are looking at.</p>
<p>The eigenvectors of the structure tensor give the orientation of the local structure. If there is an edge (one large eigenvalue), the corresponding eigenvector is the normal to that edge.</p>
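<p>For a 2x2 symmetric matrix this orientation has a closed form, so it can be computed for the whole image at once. A sketch, reusing the smoothed images <code>Jxx</code>, <code>Jxy</code>, <code>Jyy</code> from the sketch above:</p>
<pre class="lang-py prettyprint-override"><code>import numpy as np

# Orientation of the eigenvector belonging to the largest eigenvalue,
# i.e. the edge normal, per pixel:
theta = 0.5 * np.arctan2(2 * Jxy, Jxx - Jyy)
</code></pre>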
<p>Shi-Tomasi uses the smallest of the two eigenvalues. If this smallest eigenvalue is large, there is something more complex than an edge in the local neighborhood.</p>
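<p>The smallest eigenvalue of a 2x2 symmetric matrix also has a closed form, so the Shi-Tomasi response can be computed for the whole image at once rather than calling an eigenvalue routine per pixel. A sketch, again assuming the smoothed images <code>Jxx</code>, <code>Jxy</code>, <code>Jyy</code> from above:</p>
<pre class="lang-py prettyprint-override"><code>import numpy as np

# Smallest eigenvalue of [[Jxx, Jxy], [Jxy, Jyy]] per pixel:
#   lambda_min = (Jxx + Jyy)/2 - sqrt(((Jxx - Jyy)/2)**2 + Jxy**2)
shi_tomasi = 0.5 * (Jxx + Jyy) - np.sqrt(0.25 * (Jxx - Jyy)**2 + Jxy**2)
</code></pre>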
<p>The Harris corner detector also uses the structure tensor, but it combines the determinant and the trace to obtain a similar result at a lower computational cost. Shi-Tomasi is better but more expensive to compute, because the eigenvalue computation requires square roots. The Harris detector is an approximation to the Shi-Tomasi detector.</p>
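<p>For comparison, a sketch of the Harris response built from the same three images; note that no square root appears. The constant <code>k</code> is an assumption on my part (values around 0.04 to 0.06 are commonly used):</p>
<pre class="lang-py prettyprint-override"><code># Assumes the smoothed images Jxx, Jxy, Jyy from the sketch above.
k = 0.05                        # commonly used value, tune to taste
det = Jxx * Jyy - Jxy**2        # determinant of the structure tensor
trace = Jxx + Jyy               # trace of the structure tensor
harris = det - k * trace**2     # Harris response, no square root needed
</code></pre>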
<p>Here is a comparison of Shi-Tomasi (top) and Harris (bottom). I clipped both at half their maximum value, because the maxima occur in the text region, and this lets us see the weaker responses at the relevant corners better. As you can see, Shi-Tomasi responds more uniformly to all the corners in the image.</p>
<p><a href="https://i.stack.imgur.com/UGgQJ.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/UGgQJ.png" alt="Shi-Tomasi result"/></a></p>
<p><a href="https://i.stack.imgur.com/KTLa2.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/KTLa2.png" alt="Harris result"/></a></p>
<p>In both cases I used a Gaussian window with sigma=2 for the local averaging (with a cutoff at 3 sigma, this yields a 13x13 averaging window).</p>
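<p>The 13x13 figure follows directly from that cutoff; a sketch of the arithmetic, assuming the window spans ceil(3*sigma) pixels on either side of the center:</p>
<pre class="lang-py prettyprint-override"><code>import math

sigma = 2
half = math.ceil(3 * sigma)   # 6 pixels on either side of the center
size = 2 * half + 1           # 13, i.e. a 13x13 averaging window
</code></pre>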
<hr/>
<p>Looking at your updated code, I found several issues. I have annotated them with comments here:</p>
<pre class="lang-py prettyprint-override"><code>def st(image, w_size):
    v = []
    dy, dx = sy(image), sx(image)
    dy = dy**2
    dx = dx**2
    dxdy = dx*dy
    # Here you have dxdy=dx**2 * dy**2, because dx and dy were changed
    # in the lines above.
    dx = cv2.GaussianBlur(dx, (3,3), cv2.BORDER_DEFAULT)
    dy = cv2.GaussianBlur(dy, (3,3), cv2.BORDER_DEFAULT)
    dxdy = cv2.GaussianBlur(dxdy, (3,3), cv2.BORDER_DEFAULT)
    # Gaussian blur size should be indicated with the sigma of the Gaussian,
    # not with the size of the kernel. A 3x3 kernel corresponds, in OpenCV,
    # to a Gaussian with sigma = 0.8, which is way too small. Use sigma=2.
    ofset = int(w_size/2)
    for y in range(ofset, image.shape[0]-ofset):
        for x in range(ofset, image.shape[1]-ofset):
            s_y = y - ofset
            e_y = y + ofset + 1
            s_x = x - ofset
            e_x = x + ofset + 1
            w_Ixx = dx[s_y: e_y, s_x: e_x]
            w_Iyy = dy[s_y: e_y, s_x: e_x]
            w_Ixy = dxdy[s_y: e_y, s_x: e_x]
            sum_xx = w_Ixx.sum()
            sum_yy = w_Iyy.sum()
            sum_xy = w_Ixy.sum()
            # We've already done the local averaging using GaussianBlur,
            # this summing is now no longer necessary.
            m = np.matrix([[sum_xx, sum_xy],
                           [sum_xy, sum_yy]])
            eg = np.linalg.eigvals(m)
            v.append((min(eg[0], eg[1]), y, x))
    return v

def sy(img):
    t = cv2.Sobel(img,cv2.CV_8U,0,1,ksize=3)
    # The output of Sobel has positive and negative values. By writing it
    # into a 8-bit unsigned integer array, you lose all these negative
    # values, they become 0. This is half your edges that you lose!
    return t

def sx(img):
    t = cv2.Sobel(img,cv2.CV_8U,1,0,ksize=3)
    return t
</code></pre>
<p>This is how I would modify your code:</p>
<pre class="lang-py prettyprint-override"><code>import cv2
import numpy as np
def st(image):
dy, dx = sy(image), sx(image)
dxdx = cv2.GaussianBlur(dx**2, ksize = None, sigmaX=2)
dydy = cv2.GaussianBlur(dy**2, ksize = None, sigmaX=2)
dxdy = cv2.GaussianBlur(dx*dy, ksize = None, sigmaX=2)
for y in range(image.shape[0]):
for x in range(image.shape[1]):
m = np.matrix([[dxdx[y,x], dxdy[y,x]],
[dxdy[y,x], dydy[y,x]]])
eg = np.linalg.eigvals(m)
image[y,x] = min(eg[0], eg[1]) # Write into the input image.
# Better would be to create a new
# array as output. Make sure it is
# a floating-point type!
def sy(img):
t = cv2.Sobel(img,cv2.CV_32F,0,1,ksize=3)
return t
def sx(img):
t = cv2.Sobel(img,cv2.CV_32F,1,0,ksize=3)
return t
image = cv2.imread('fu4r5.png', 0)
output = image.astype(np.float32) # I'm writing the result of the detector in here
st(output)
pp.imshow(output); pp.show()
</code></pre>