<p>Starting from the input image, binarize it so that the text is white and the background is black:</p>
<p><a href="https://i.stack.imgur.com/DmbZx.png" rel="noreferrer"><img src="https://i.stack.imgur.com/DmbZx.png" alt="enter image description here"/></a></p>
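<p>To make the inversion explicit, here is a minimal sketch of the same thresholding on a plain grayscale buffer, without OpenCV. The helper name <code>binarizeInverted</code> is hypothetical, and the default threshold of 200 is simply the value used in this answer's code; a different scan may need another value.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV function).
// Pixels darker than `threshold` (ink) become 255 (white);
// all other pixels (paper) become 0 (black).
std::vector<std::uint8_t> binarizeInverted(const std::vector<std::uint8_t>& gray,
                                           std::uint8_t threshold = 200)
{
    std::vector<std::uint8_t> bin(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
    {
        bin[i] = gray[i] < threshold ? 255 : 0;
    }
    return bin;
}
```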
<p>Next, compute the rotation angle of the bill. A simple approach is to find the <code>minAreaRect</code> of all white points (<code>findNonZero</code>), which gives:</p>
<p><a href="https://i.stack.imgur.com/Y1eTU.png" rel="noreferrer"><img src="https://i.stack.imgur.com/Y1eTU.png" alt="enter image description here"/></a></p>
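<p>The rectangle returned by <code>minAreaRect</code> does not say which side is the "height", so the code in this answer swaps the sides and adds 90 degrees whenever the width is the longer side. A minimal sketch of that normalization, using a hypothetical <code>Box</code> struct in place of <code>RotatedRect</code>, could look like this. It assumes the bill is taller than it is wide, and since the angle convention of <code>minAreaRect</code> has differed between OpenCV versions, the sign is worth checking on a test image.</p>

```cpp
#include <utility>

// Hypothetical stand-in for the fields of cv::RotatedRect used here.
struct Box
{
    float width;
    float height;
    float angle; // degrees
};

// Make the box "portrait": the long side becomes the height,
// and the angle is shifted by 90 degrees to compensate for the swap.
Box makePortrait(Box b)
{
    if (b.width > b.height)
    {
        std::swap(b.width, b.height);
        b.angle += 90.f;
    }
    return b;
}
```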
<p>Then you can rotate the bill so that the text is horizontal:</p>
<p><a href="https://i.stack.imgur.com/Ky0jX.png" rel="noreferrer"><img src="https://i.stack.imgur.com/Ky0jX.png" alt="enter image description here"/></a></p>
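<p>For reference, the 2x3 matrix used for this rotation can be sketched by hand. The code below follows the formula given in the OpenCV documentation for <code>getRotationMatrix2D</code> (angle in degrees, positive meaning counter-clockwise with the origin at the top-left). The names <code>Affine</code>, <code>rotationMatrix</code> and <code>applyAffine</code> are hypothetical, and in practice you would just call the OpenCV function.</p>

```cpp
#include <array>
#include <cmath>

// Row-major 2x3 affine matrix: [m0 m1 m2; m3 m4 m5].
using Affine = std::array<double, 6>;

// Rotation by `angleDeg` around (cx, cy) with isotropic `scale`.
Affine rotationMatrix(double cx, double cy, double angleDeg, double scale)
{
    const double pi = std::acos(-1.0);
    const double rad = angleDeg * pi / 180.0;
    const double a = scale * std::cos(rad);
    const double b = scale * std::sin(rad);
    return {a,  b, (1.0 - a) * cx - b * cy,
            -b, a, b * cx + (1.0 - a) * cy};
}

// Map a point (x, y) through the affine matrix.
std::array<double, 2> applyAffine(const Affine& m, double x, double y)
{
    return {m[0] * x + m[1] * y + m[2],
            m[3] * x + m[4] * y + m[5]};
}
```

<p>With a scale of 1 the rotation center maps to itself, which is why passing <code>box.center</code> keeps the bill in place while it is straightened.</p>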
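<p>Now you can compute the horizontal projection (<code>reduce</code>), taking the mean value of each row. Apply a threshold <code>th</code> to the histogram to account for some noise in the image (here I used <code>0</code>, i.e. no noise). After thresholding, rows that contain only background have a value <code>>0</code> in the histogram, while text rows have the value <code>0</code>. Then take the mean bin coordinate of each continuous run of white bins in the histogram. That is the <code>y</code> coordinate of each of your lines:</p>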
<p><a href="https://i.stack.imgur.com/C7Z2h.png" rel="noreferrer"><img src="https://i.stack.imgur.com/C7Z2h.png" alt="enter image description here"/></a></p>
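<p>The projection and grouping step can also be sketched without OpenCV, which may help when porting it. This is only an illustration on a plain vector-of-rows image; the function names <code>rowMeans</code> and <code>spaceRowCenters</code> are hypothetical. As in the loop in this answer's code, a run of empty rows that reaches the bottom of the image is not reported.</p>

```cpp
#include <cstddef>
#include <vector>

// Horizontal projection: mean pixel value of each row.
std::vector<float> rowMeans(const std::vector<std::vector<unsigned char>>& img)
{
    std::vector<float> means;
    means.reserve(img.size());
    for (const auto& row : img)
    {
        float sum = 0.f;
        for (unsigned char v : row) sum += v;
        means.push_back(row.empty() ? 0.f : sum / static_cast<float>(row.size()));
    }
    return means;
}

// For each run of consecutive "space" rows (mean <= th) that is followed
// by a text row, return the mean row index of the run (integer division).
std::vector<int> spaceRowCenters(const std::vector<float>& means, float th = 0.f)
{
    std::vector<int> centers;
    long sum = 0;
    int count = 0;
    for (std::size_t i = 0; i < means.size(); ++i)
    {
        if (means[i] <= th)
        {
            sum += static_cast<long>(i);
            ++count;
        }
        else if (count > 0)
        {
            centers.push_back(static_cast<int>(sum / count));
            sum = 0;
            count = 0;
        }
    }
    return centers;
}
```

<p>Raising <code>th</code> slightly above 0 lets rows with a few stray white pixels still count as space, which is the role of the threshold described above.</p>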
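<p>Here is the full code. It's in C++, but since most of the work is done by OpenCV functions, it should be easy to convert to Python. At the very least, you can use it as a reference:</p>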
<pre><code>#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
    // Read image
    Mat3b img = imread("path_to_image");

    // Binarize image. Text is white, background is black
    Mat1b bin;
    cvtColor(img, bin, COLOR_BGR2GRAY);
    bin = bin < 200;

    // Find all white pixels
    vector<Point> pts;
    findNonZero(bin, pts);

    // Get rotated rect of white pixels
    RotatedRect box = minAreaRect(pts);
    // Make the long side the height and adjust the angle accordingly
    if (box.size.width > box.size.height)
    {
        swap(box.size.width, box.size.height);
        box.angle += 90.f;
    }

    // Draw the rotated rect on the input image (for visualization only)
    Point2f vertices[4];
    box.points(vertices);
    for (int i = 0; i < 4; ++i)
    {
        line(img, vertices[i], vertices[(i + 1) % 4], Scalar(0, 255, 0));
    }

    // Rotate the image according to the found angle
    Mat1b rotated;
    Mat M = getRotationMatrix2D(box.center, box.angle, 1.0);
    warpAffine(bin, rotated, M, bin.size());

    // Compute horizontal projections (mean of each row).
    // REDUCE_AVG is the OpenCV 3+ name; OpenCV 2.4 uses CV_REDUCE_AVG
    Mat1f horProj;
    reduce(rotated, horProj, 1, REDUCE_AVG);

    // Remove noise in histogram. White bins identify space lines, black bins identify text lines
    float th = 0;
    Mat1b hist = horProj <= th;

    // Get mean coordinate of each group of consecutive white bins
    vector<int> ycoords;
    int y = 0;
    int count = 0;
    bool isSpace = false;
    for (int i = 0; i < rotated.rows; ++i)
    {
        if (!isSpace)
        {
            if (hist(i))
            {
                // Start of a new run of space rows
                isSpace = true;
                count = 1;
                y = i;
            }
        }
        else
        {
            if (!hist(i))
            {
                // End of the run: store its mean row index
                isSpace = false;
                ycoords.push_back(y / count);
            }
            else
            {
                y += i;
                count++;
            }
        }
    }

    // Draw line as final result
    Mat3b result;
    cvtColor(rotated, result, COLOR_GRAY2BGR);
    for (size_t i = 0; i < ycoords.size(); ++i)
    {
        line(result, Point(0, ycoords[i]), Point(result.cols, ycoords[i]), Scalar(0, 255, 0));
    }

    return 0;
}
</code></pre>