如果图像包含特定颜色,则裁剪图像的线条

2024-05-11 03:21:17 发布

您现在位置:Python中文网/ 问答频道 /正文

我要裁剪包含特定颜色的图像的线条。你知道吗

我已经有以下代码行来获得特定的颜色。它是铅笔笔画图像中包含的颜色。你知道吗

# we get the dominant colors
img = cv2.imread('stroke.png')
height, width, dim = img.shape
# We take only the center of the image
img = img[int(height/4):int(3*height/4), int(width/4):int(3*width/4), :]
height, width, dim = img.shape

img_vec = np.reshape(img, [height * width, dim] )

kmeans = KMeans(n_clusters=3)
kmeans.fit( img_vec )

#  count cluster pixels, order clusters by cluster size
unique_l, counts_l = np.unique(kmeans.labels_, return_counts=True)
sort_ix = np.argsort(counts_l)
sort_ix = sort_ix[::-1]

fig = plt.figure()
ax = fig.add_subplot(111)
x_from = 0.05

# colors are cluster_center in kmeans.cluster_centers_[sort_ix] I think

然后我要分析我所有的图像行,并裁剪出边缘有连续铅笔笔划的行。也就是说,至少有一个像素具有示例stroke.png的一种颜色的行,白色被排除在外(我还没有实现)。最后从这些行中提取文本。你知道吗

### Attempt to get the colors of the stroke example
# we get the dominant colors
img = cv2.imread('strike.png')
height, width, dim = img.shape
# We take only the center of the image
img = img[int(height/4):int(3*height/4), int(width/4):int(3*width/4), :]
height, width, dim = img.shape

img_vec = np.reshape(img, [height * width, dim] )

kmeans = KMeans(n_clusters=2)
kmeans.fit( img_vec )

#  count cluster pixels, order clusters by cluster size
unique_l, counts_l = np.unique(kmeans.labels_, return_counts=True)
sort_ix = np.argsort(counts_l)
sort_ix = sort_ix[::-1]

fig = plt.figure()
ax = fig.add_subplot(111)
x_from = 0.05

cluster_center = kmeans.cluster_centers_[sort_ix][1]

# plt.show()
### End of attempt

for file_name in file_names:
    print("we wrote : ",file_name)
    # load the image and convert it to grayscale
    image = cv2.imread(file_name)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # check to see if we should apply thresholding to preprocess the
    # image
    if args["preprocess"] == "thresh":
        gray = cv2.threshold(gray, 0, 255,
            cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

    # make a check to see if median blurring should be done to remove
    # noise
    elif args["preprocess"] == "blur":
        gray = cv2.medianBlur(gray, 3)

    # write the grayscale image to disk as a temporary file so we can
    # apply OCR to it
    filename = "{}.png".format(os.getpid())
    cv2.imwrite(filename, gray)

    # Here we should split the images in parts. Those who have strokes
    # We asked for a stroke example so we have its color 
    # While we find pixels with the same color we store its line
    im = Image.open(filename)
    (width, height)= im.size
    for x in range(width): 
        for y in range(height):
            rgb_im = im.convert('RGB')
            red, green, blue = rgb_im.getpixel((1, 1))
            # We test if the pixel has the same color as the second cluster # We should rather test if it is "alike"
            # It means that we found a line were there is some paper stroke
            if np.array_equal([red,green,blue],cluster_center): 
                # if it is the case we store the width as starting point while we find pixels 
                # and we break the loop to go to another line
                if start == -1:
                    start = x
                    selecting_area = True
                    break
                # if it already started we break the loop to go to another line
                if selecting_area == True:
                    break
            # if no pixel in a line had the same color as the second cluster but selecting already started
            # we crop the image and go to another line
            # it means that there is no more paper stroke
            if selecting_area == True:
                text_box = (0, start, width, x)
                # Crop Image
                area = im.crop(text_box)
                area.show() 
                selecting_area = False
                break


    # load the image as a PIL/Pillow image, apply OCR, and then delete
    # the temporary file
    text = pytesseract.image_to_string(Image.open(filename))
    os.remove(filename)
    #print(text)

    with open('resume.txt', 'a+') as f:
        print('***:', text, file=f)  

因此,到目前为止,如果我能够得到我想要用来裁剪图像的颜色,那么我设计的测试知道图像的哪个部分实际上必须被裁剪,似乎没有结束你能帮我实现吗?你知道吗

附件

  • this paper开发的另一个想法是将笔划分组并分别识别文本,但我还不知道任何分组算法可以帮助我完成这项工作。

  • 要处理的图像示例:

enter image description here

  • 铅笔笔划示例:

enter image description here

完整的项目,一个注释文本摘要器,可以在Githubhere上找到。你知道吗


Tags: thetoimageimgifnpsortwidth