Detect and crop boxes in a .pdf or image as individual images

from PyPDF2 import PdfFileReader, PdfFileWriter

reader = PdfFileReader('data/samples.pdf', 'r')

# getting the first page
page = reader.getPage(0)

writer = PdfFileWriter()

# Loop through all pages in pdf object to crop based on (x,y) coordinates
for i in range(reader.getNumPages()):
    page = reader.getPage(i)
    page.cropBox.setLowerLeft((42,115))
    page.cropBox.setUpperRight((500, 245))
    writer.addPage(page)

outstream = open('samples_cropped.pdf','wb')
writer.write(outstream)
outstream.close()

1 Answer

User

#1 · Posted 2024-04-26 04:51:37

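Here is a simple approach using OpenCV:

Convert the image to grayscale and apply a Gaussian blur
Threshold the image
Find contours
Iterate through the contours and filter by contour area
Extract the ROI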

After extracting the ROIs, you can save each one as a separate image and then run OCR text extraction with pytesseract or another tool.
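As a rough sketch of that OCR step, the helper below maps each saved ROI file to its recognized text. The names `default_ocr` and `ocr_rois` are made up for this illustration and are not part of the original answer. It assumes pytesseract and the Tesseract binary are installed, and the OCR function is a parameter so another tool can be swapped in.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def default_ocr(path: str) -> str:
    """Run Tesseract on one image file.

    Assumes pytesseract and the tesseract binary are installed; the import is
    done lazily so ocr_rois() can be used with a different OCR backend.
    """
    import pytesseract
    return pytesseract.image_to_string(path)


def ocr_rois(paths: Iterable[str],
             ocr: Callable[[str], str] = default_ocr) -> Dict[str, str]:
    """Return a dict mapping each ROI file name to its stripped OCR text.

    `ocr_rois` is a hypothetical helper name, not an OpenCV or pytesseract API.
    """
    return {Path(p).name: ocr(str(p)).strip() for p in paths}
```

It could then be called as `ocr_rois(sorted(Path('.').glob('ROI_*.png')))`, assuming the ROI images were written to the current directory with the `ROI_{}.png` naming used in the answer's code.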

You mentioned this:

The boundaries/coordinates of the handwriting boxes wont always be the same for each page in the pdf.

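Your current approach of using fixed (x,y) coordinates is not very reliable, because the boxes could be anywhere on the image. A better approach is to detect the boxes by filtering contours with a minimum contour area threshold. Depending on the size of the boxes you want to detect, you can adjust that threshold (the min_area variable in the code). If you want extra filtering to prevent false positives, you can add aspect ratio as a second filter: compute the width-to-height ratio of each contour's bounding rectangle, and if it falls within a chosen range (for example, starting at 0.8), treat it as a valid box.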

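import cv2

# Load the image and keep an unmodified copy to crop the ROIs from
image = cv2.imread('1.jpg')
original = image.copy()

# Grayscale, light blur, then inverted threshold (pixels <= 230 become white)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blurred, 230,255,cv2.THRESH_BINARY_INV)[1]

# Find contours (the return value has 2 or 3 elements depending on the OpenCV version)
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Iterate through contours and filter for ROI
image_number = 0
min_area = 10000  # minimum contour area in pixels; tune to the box size
for c in cnts:
    area = cv2.contourArea(c)
    if area > min_area:
        x,y,w,h = cv2.boundingRect(c)
        # Draw the detected box on the preview and save the crop from the clean copy
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
        ROI = original[y:y+h, x:x+w]
        cv2.imwrite("ROI_{}.png".format(image_number), ROI)
        image_number += 1

# Show the annotated image until a key is pressed
cv2.imshow('image', image)
cv2.waitKey(0)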
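The optional aspect-ratio filter described above might look something like the sketch below. The function name `is_box_like` is invented for this example. The 0.8 lower bound comes from the answer; the upper bound is left as a required argument because it depends on the shape of your boxes.

```python
def is_box_like(w: int, h: int, *, min_ratio: float = 0.8, max_ratio: float) -> bool:
    """Return True if a w-by-h bounding rectangle has an acceptable aspect ratio.

    Hypothetical helper: min_ratio defaults to the 0.8 mentioned in the answer,
    max_ratio has no default and must be chosen by the caller.
    """
    if w <= 0 or h <= 0:
        # Degenerate rectangles can never be a valid box
        return False
    aspect_ratio = w / float(h)
    return min_ratio <= aspect_ratio <= max_ratio
```

Inside the contour loop, the check would go right after `cv2.boundingRect(c)`, for instance `if area > min_area and is_box_like(w, h, max_ratio=1.2):`. The 1.2 here is only an illustrative value suited to roughly square boxes; wide handwriting boxes would need a range such as `min_ratio=2.0, max_ratio=4.0`, again chosen by measuring your own pages.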
