Why can't I get a string using PIL and pytesseract?

>>> img1 = remove_noise_and_smooth(r'/tmp/target.jpg')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in remove_noise_and_smooth
AttributeError: 'NoneType' object has no attribute 'astype'

Thalish Sajeed

Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image
>>> import pytesseract
>>> import matplotlib.pyplot as plt
>>> import cv2
>>> import numpy as np
>>>
>>>
>>> def display_image(filename, length_box=60, width_box=30):
...     if type(filename) == np.ndarray:
...         image = filename
...     else:
...         image = cv2.imread(filename)
...     plt.figure(figsize=(length_box, width_box))
...     plt.imshow(image, cmap="gray")
...
>>>
>>> filename = r"/tmp/target.jpg"
>>> display_image(filename)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in display_image
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/pyplot.py", line 2699, in imshow
    None else {}), **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/__init__.py", line 1810, in inner
    return func(ax, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/axes/_axes.py", line 5494, in imshow
    im.set_data(X)
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/image.py", line 634, in set_data
    raise TypeError("Image data cannot be converted to float")
TypeError: Image data cannot be converted to float
>>>

>>> import cv2, pytesseract
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>>
>>> def image_smoothening(img):
...     ret1, th1 = cv2.threshold(img, 88, 255, cv2.THRESH_BINARY)
...     ret2, th2 = cv2.threshold(th1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
...     blur = cv2.GaussianBlur(th2, (5, 5), 0)
...     ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
...     return th3
...
>>>
>>> def remove_noise_and_smooth(file_name):
...     img = cv2.imread(file_name, 0)
...     filtered = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 41)
...     kernel = np.ones((1, 1), np.uint8)
...     opening = cv2.morphologyEx(filtered, cv2.MORPH_OPEN, kernel)
...     closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel)
...     img = image_smoothening(img)
...     or_image = cv2.bitwise_or(img, closing)
...     return or_image
...
>>>
>>> cv2_thresh_list = [cv2.THRESH_BINARY, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO]
>>> fn = r'/tmp/target.jpg'
>>> img1 = remove_noise_and_smooth(fn)
>>> img2 = cv2.imread(fn, 0)
>>> for i, img in enumerate([img1, img2]):
...     img_type = {0: 'Preprocessed Images\n',
...                 1: '\nUnprocessed Images\n'}
...     print(img_type[i])
...     for item in cv2_thresh_list:
...         print('Thresh: {}'.format(str(item)))
...         _, thresh = cv2.threshold(img, 127, 255, item)
...         plt.imshow(thresh, 'gray')
...         f_name = '{0}.jpg'.format(str(item))
...         plt.savefig(f_name)
...         print('OCR Result: {}\n'.format(pytesseract.image_to_string(f_name)))

Thresh: 0
<matplotlib.image.AxesImage object at 0x7fbc2519a6d8>
OCR Result: 10 15 20 Edﬁﬁ 10 2 o 30 40 so so

Thresh: 2
<matplotlib.image.AxesImage object at 0x7fbc255e7eb8>
OCR Result: 10 15 20 Edﬁﬁ 10 2 o 30 40 so so

Thresh: 3
<matplotlib.image.AxesImage object at 0x7fbc25452fd0>
OCR Result: 10 15 20 Edﬁﬁ 10 2 o 30 40 so so

Unprocessed Images

Thresh: 0
<matplotlib.image.AxesImage object at 0x7fbc25464c88>
OCR Result: 10 15 20

Thresh: 2
<matplotlib.image.AxesImage object at 0x7fbc254520f0>
OCR Result: 10 15 2o 2o 30 40 50

Thresh: 3
<matplotlib.image.AxesImage object at 0x7fbc1e1968d0>
OCR Result: 10 15 20

2 Answers

User

#1 · Edited 2024-04-26 07:15:17

First: make sure you have installed the Tesseract program itself, not just the Python package.
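A quick way to check this is to look for the executable on PATH before calling pytesseract. The following is a minimal sketch using only the standard library; the helper name `find_tesseract` is hypothetical, and the default binary name `tesseract` is an assumption about your install.

```python
import shutil
from typing import Optional


def find_tesseract(binary_name: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        # Assumption: in this case you would install Tesseract or set
        # pytesseract.pytesseract.tesseract_cmd to the executable's location.
        print("Tesseract executable not found on PATH.")
    else:
        print("Found Tesseract at:", path)
```

If this reports that nothing was found, pytesseract will probably fail too, because it runs the external program rather than doing the OCR itself.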

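Jupyter Notebook of Solution: only the image passed through remove_noise_and_smooth is read successfully by OCR.

Attempting to convert image.gif raises TypeError: int() argument must be a string, a bytes-like object or a number, not 'tuple'.

Renaming image.gif to another file type still raises the TypeError.

Opening image.gif and using "Save As" to another file type produces empty output, meaning the text could not be recognized.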
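Renaming a file changes only its name, not the bytes inside, so one way around the format error is to re-encode the GIF. The sketch below uses Pillow and is an assumption about a workable approach, not the answer's own code; `gif_to_png` and the paths passed to it are hypothetical. It keeps only the first frame and saves it as a grayscale PNG. This addresses the format error only; the converted image may still need cleaning before OCR.

```python
from PIL import Image


def gif_to_png(gif_path: str, png_path: str) -> str:
    """Re-encode the first frame of a GIF as a grayscale PNG and return the new path."""
    with Image.open(gif_path) as im:
        im.seek(0)  # use the first frame only
        # GIFs are usually palette images; convert to 8-bit grayscale first
        im.convert("L").save(png_path, format="PNG")
    return png_path
```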

from PIL import Image
import pytesseract

# If you don't have tesseract executable in your PATH, include the following:
# your path may be different than mine
pytesseract.pytesseract.tesseract_cmd = "C:/Program Files (x86)/Tesseract-OCR/tesseract.exe"

imgo = Image.open('0244R_clean.jpg')

print(pytesseract.image_to_string(imgo))

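The text could not be recognized from the original image, so the image may need additional processing to clean it up before OCR.
I created a clean image from which pytesseract extracts the text without any problem. The image is included below so that you can test it with your own code and verify that it works.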

Adding post-processing

Improve Accuracy of OCR using Image Preprocessing

OpenCV


img1 produces the following new images:

img2 produces these new images:

User

#2 · Edited 2024-04-26 07:15:17

Let's start with a JPG image, since pytesseract has problems handling the GIF image format (reference).

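import cv2
import pytesseract

filename = "/tmp/target.jpg"
image = cv2.imread(filename)  # returns None if the file is missing or unreadable
if image is None:
    raise FileNotFoundError(filename)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # drop the color information
ret, threshold = cv2.threshold(gray, 55, 255, cv2.THRESH_BINARY)  # pixels above 55 become white, the rest black
print(pytesseract.image_to_string(threshold))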

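Let's try to break the problem down.

Your image is too noisy for the tesseract engine to recognize the letters, so we use some simple image-processing techniques, such as grayscale conversion and thresholding, to remove some of the noise.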
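To make those two steps concrete, here is a small NumPy sketch of what grayscale conversion and binary thresholding do to pixel values. It is an illustration, not OpenCV's implementation: the function names are hypothetical, the weights are the standard luma weights (0.299, 0.587, 0.114), and rounding may differ slightly from cv2.cvtColor.

```python
import numpy as np


def to_gray(bgr: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 image in BGR channel order to single-channel uint8."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.114 * b + 0.587 * g + 0.299 * r  # weighted sum of the channels
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def binary_threshold(gray: np.ndarray, thresh: int = 55, max_value: int = 255) -> np.ndarray:
    """Pixels strictly above `thresh` become `max_value`; all others become 0."""
    return np.where(gray > thresh, max_value, 0).astype(np.uint8)
```

Faint noise that is darker than the threshold collapses to black and everything brighter becomes white, so the OCR engine sees only two levels instead of a noisy gradient.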

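Then, when we send it to the OCR engine, the letters are captured more accurately.

If you follow this github link, you can find the notebook where I tested this.

Edit - I have updated the notebook with some additional image-cleaning techniques. The source image is too noisy to run tesseract on it directly. You need to use image-cleaning techniques.

You can vary the threshold parameters, or swap the Gaussian blur for another technique, until you get the results you want.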
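This trial-and-error tuning can be sketched as a small sweep over candidate thresholds. Everything named below is hypothetical: `sweep_thresholds` is an illustrative helper, and `ocr` stands for any callable that turns an image array into text, such as pytesseract.image_to_string.

```python
from typing import Callable, Dict, Iterable

import numpy as np


def sweep_thresholds(gray: np.ndarray,
                     thresholds: Iterable[int],
                     ocr: Callable[[np.ndarray], str]) -> Dict[int, str]:
    """Binarize `gray` at each threshold and collect the OCR text per threshold."""
    results = {}
    for t in thresholds:
        # Same rule as a binary threshold: above t becomes white, the rest black
        binary = np.where(gray > t, 255, 0).astype(np.uint8)
        results[t] = ocr(binary).strip()
    return results
```

With OpenCV and pytesseract installed, a call might look like `sweep_thresholds(cv2.imread(filename, 0), range(40, 200, 20), pytesseract.image_to_string)`; inspect the returned dictionary and keep the threshold whose text looks right.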

If you want to run OCR on noisy images, look at commercial OCR providers such as google-cloud-vision. They offer 1000 OCR calls free per month.

