Replicating TesserCap's chopping filter to remove background noise from CAPTCHA images

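2 answers

User

Answer 1 · edited 2024-04-26 10:05:30

The algorithm essentially checks whether there are several target pixels (in this case, non-white pixels) in a row, and changes those pixels if their count is less than or equal to the chop factor.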

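For example, in a sample row of pixels where # is black and - is white, applying a chop factor of 2 transforms --#--###-##---#####---#-# into -----###------#####------. This is because the runs of black pixels that are 2 pixels long or shorter are replaced with white. Contiguous runs longer than 2 pixels are left unchanged.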
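The single-row example above can be reproduced with a few lines of Python. This is only a minimal sketch operating on a string rather than on image data; the function name `chop_row` is a hypothetical helper introduced for illustration, not part of TesserCap or of the answer's script.

```python
import re


def chop_row(row: str, chop: int) -> str:
    """Replace every run of '#' that is `chop` characters or shorter with '-'."""

    def whiten(match: re.Match) -> str:
        run = match.group(0)
        # Short runs are treated as noise and become white; long runs are kept.
        return "-" * len(run) if len(run) <= chop else run

    # '#+' matches each maximal run of consecutive dark pixels.
    return re.sub(r"#+", whiten, row)


print(chop_row("--#--###-##---#####---#-#", 2))
# -----###------#####------
```

Matching whole runs at once means each run is measured exactly once, from its first pixel to its last.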

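Here is the result of running the chop algorithm, as implemented in my Python code (below), on the original image from your post:

[Image: 'Chopped' image]

To apply this to the whole image, simply run the algorithm on every row and every column. The following Python code does that: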

import PIL.Image
import sys

# python chop.py [chop-factor] [in-file] [out-file]

chop = int(sys.argv[1])
image = PIL.Image.open(sys.argv[2]).convert('1')
width, height = image.size
data = image.load()

# Iterate through the rows.
for y in range(height):

    # Use a while loop so x can jump past a whole run; reassigning the
    # loop variable of a for loop does not affect the iteration.
    x = 0
    while x < width:

        # Make sure we're on a dark pixel.
        if data[x, y] >= 128:
            x += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from x to image.width.
        for c in range(x, width):

            # If the pixel is dark, add it to the total.
            if data[c, y] < 128:
                total += 1

            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is at most the chop factor, replace the run with white.
        if total <= chop:
            for c in range(total):
                data[x + c, y] = 255

        # Skip the run we just examined.
        x += total


# Iterate through the columns.
for x in range(width):

    # Use a while loop so y can jump past a whole run; reassigning the
    # loop variable of a for loop does not affect the iteration.
    y = 0
    while y < height:

        # Make sure we're on a dark pixel.
        if data[x, y] >= 128:
            y += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from y to image.height.
        for c in range(y, height):

            # If the pixel is dark, add it to the total.
            if data[x, c] < 128:
                total += 1

            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is at most the chop factor, replace the run with white.
        if total <= chop:
            for c in range(total):
                data[x, y + c] = 255

        # Skip the run we just examined.
        y += total

image.save(sys.argv[3])

User

Answer 2 · edited 2024-04-26 10:05:30

Try the following (pseudocode):

for each row of pixels:
    if there is a group of about 3 or more pixels in a row, leave them
    else remove the pixels

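Then repeat the same operation for the columns. It looks like it works at least somewhat. Sweeping both horizontally and vertically like this also removes horizontal and vertical lines.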
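To illustrate why the two passes together remove thin lines, here is a small sketch that works on a grid of booleans (True meaning a dark pixel) instead of a real image. The names `chop_line`, `chop_grid` and `to_grid` are hypothetical helpers made up for this example, and a chop factor of 2 is assumed to stand in for "about 3 or more pixels".

```python
from itertools import groupby


def chop_line(line, chop):
    """Whiten every run of dark (True) pixels that is `chop` long or shorter."""
    out = []
    for is_dark, run in groupby(line):
        run = list(run)
        if is_dark and len(run) <= chop:
            out.extend([False] * len(run))
        else:
            out.extend(run)
    return out


def chop_grid(grid, chop):
    """Apply chop_line to every row, then to every column of the result."""
    rows = [chop_line(row, chop) for row in grid]
    # zip(*rows) transposes the grid, so each column is handled like a row.
    cols = [chop_line(col, chop) for col in zip(*rows)]
    # Transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


def to_grid(lines):
    """Convert strings of '#' and '-' into rows of booleans."""
    return [[ch == "#" for ch in line] for line in lines]


thin_line = to_grid(["-----", "#####", "-----"])
block = to_grid(["###", "###", "###"])

print(chop_grid(thin_line, 2))  # every pixel ends up False (white)
print(chop_grid(block, 2) == block)  # True: the solid block is kept
```

The one-pixel-thick horizontal line survives the row pass because it is a long horizontal run, but in the column pass each of its pixels is a vertical run of length 1, so it is whitened. The 3x3 block is longer than the chop factor in both directions and is left alone.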
