Removing background noise lines from a CAPTCHA image with Python PIL

import sys

import PIL.Image

# Usage: python chop.py [chop-factor] [in-file] [out-file]
chop = int(sys.argv[1])
image = PIL.Image.open(sys.argv[2]).convert('1')
width, height = image.size
data = image.load()

# Iterate through the rows.
for y in range(height):
    x = 0
    while x < width:
        # Make sure we're on a dark pixel.
        if data[x, y] > 128:
            x += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from x to image.width.
        for c in range(x, width):
            # If the pixel is dark, add it to the total.
            if data[c, y] < 128:
                total += 1
            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is no greater than the chop, replace everything with white.
        if total <= chop:
            for c in range(total):
                data[x + c, y] = 255

        # Skip the run we just examined. A while loop is used because
        # reassigning the variable of a for loop does not skip iterations,
        # which would otherwise chop the tail of every long run.
        x += total

# Iterate through the columns.
for x in range(width):
    y = 0
    while y < height:
        # Make sure we're on a dark pixel.
        if data[x, y] > 128:
            y += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from y to image.height.
        for c in range(y, height):
            # If the pixel is dark, add it to the total.
            if data[x, c] < 128:
                total += 1
            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is no greater than the chop, replace everything with white.
        if total <= chop:
            for c in range(total):
                data[x, y + c] = 255

        # Skip the run we just examined.
        y += total

image.save(sys.argv[3])

3 Answers

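User

Answer 1 · edited 2024-06-10 18:34:15

To get rid of most of the lines quickly, you can turn every black pixel that has two or fewer adjacent black pixels white. That should take care of the stray lines. Then, once you are left with a number of "blocks", you can remove the smaller ones.

This assumes the sample image has been enlarged and that the lines are only one pixel wide.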
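
A minimal sketch of the two steps described above, assuming the image has already been binarized into a list of rows in which 1 is a dark pixel and 0 is white. The function names, the choice of 8-neighbour adjacency, and the `min_size` threshold are illustrative assumptions, not part of the answer.

```python
from collections import deque

NEIGHBOURS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if dy or dx]


def remove_sparse_pixels(grid, max_neighbours=2):
    """Whiten dark pixels that have at most max_neighbours dark 8-neighbours."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for y in range(h):
        for x in range(w):
            if not grid[y][x]:
                continue
            # Count dark neighbours in the ORIGINAL grid so that earlier
            # removals do not influence later decisions.
            count = sum(
                grid[y + dy][x + dx]
                for dy, dx in NEIGHBOURS
                if 0 <= y + dy < h and 0 <= x + dx < w
            )
            if count <= max_neighbours:
                out[y][x] = 0
    return out


def remove_small_blocks(grid, min_size):
    """Whiten every 8-connected block of dark pixels smaller than min_size."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected block.
            block, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                block.append((cy, cx))
                for dy, dx in NEIGHBOURS:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(block) < min_size:
                for by, bx in block:
                    out[by][bx] = 0
    return out
```

A pixel in the middle of a one-pixel-wide line has exactly two dark neighbours, which is why the threshold of two removes such lines while the corners of solid glyph strokes (three or more neighbours) survive. Where a line crosses a character it picks up extra neighbours and is left behind, which is what the second pass over small blocks is meant to clean up.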

User

Answer 2 · edited 2024-06-10 18:34:15

You could use your own dilate and erode functions, which will remove the thinnest lines. A good implementation can be found here.
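
As a rough illustration of hand-written dilate and erode functions, here is a sketch that treats dark pixels as foreground (1) on a white background (0) and uses a fixed 3x3 window. The representation as a list of rows, the function names, and the window size are assumptions made for this example; it is not the implementation the answer links to.

```python
def _apply_window(grid, combine):
    """Apply combine (min or max) over each pixel's 3x3 neighbourhood.

    Windows are clipped at the image border rather than padded.
    """
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                grid[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = combine(window)
    return out


def erode(grid):
    """A pixel stays dark only if its whole 3x3 window is dark."""
    return _apply_window(grid, min)


def dilate(grid):
    """A pixel becomes dark if any pixel in its 3x3 window is dark."""
    return _apply_window(grid, max)


def remove_thin_features(grid):
    """Erode, then dilate (a morphological opening of the dark pixels)."""
    return dilate(erode(grid))
```

Eroding first deletes anything thinner than three pixels, and dilating afterwards grows the surviving strokes back to roughly their original thickness. Character strokes thinner than the window would be lost as well, so this works best on an enlarged image.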

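User

Answer 3 · edited 2024-06-10 18:34:15

I personally use dilate and erode as described above, but combine them with some basic statistics on width and height to find outliers and eliminate those lines as needed. After that, a filter should work that takes the minimum value within a kernel and writes that value to the central pixel of a temporary image (iterating over the old image), after which the temporary image replaces the original. In Pillow/PIL, the minimum-based step is done with img.filter(ImageFilter.MinFilter(3)).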
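
A short sketch of how Pillow's rank filters could be chained for this, assuming dark text on a white background. `ImageFilter.MinFilter` and `ImageFilter.MaxFilter` are Pillow classes; the function name `remove_thin_lines`, the 3x3 kernel, and the max-then-min ordering are assumptions chosen for illustration and leave out the width and height statistics the answer mentions.

```python
from PIL import Image, ImageFilter


def remove_thin_lines(img, size=3):
    """Suppress dark lines thinner than `size` pixels in a light image."""
    gray = img.convert("L")
    # MaxFilter replaces each pixel with the brightest value in its window,
    # so dark features narrower than the window disappear.
    brightened = gray.filter(ImageFilter.MaxFilter(size))
    # MinFilter replaces each pixel with the darkest value in its window,
    # which thickens the surviving dark strokes back toward their old size.
    return brightened.filter(ImageFilter.MinFilter(size))
```

It could be applied as `remove_thin_lines(Image.open("captcha.png")).save("cleaned.png")`, where both file names are placeholders. Each `filter` call returns a new image, so the temporary-image bookkeeping described above is handled by Pillow.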

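If that is not sufficient, it should at least produce an identifiable set of shapes, on which OpenCV's contours and minimum-area rotated bounding box can be used to rotate each letter for comparison (at that point I recommend Tesseract or a commercial OCR engine, since they come with a large number of fonts and extra features such as clustering and cleanup).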
