使用Python快速确定图像是否（模糊地）在集合中

import PIL seen_images = {} # This would really be a shelf or something # From http://www.guguncube.com/1656/python-image-similarity-comparison-using-several-techniques def image_pixel_hash_code(image): pixels = list(image.getdata()) avg = sum(pixels) / len(pixels) bits = "".join(map(lambda pixel: '1' if pixel < avg else '0', pixels)) # '00010100...' hexadecimal = int(bits, 2).__format__('016x').upper() return hexadecimal def process_image(filepath): thumb = PIL.Image.open(filepath).resize((128,128)).convert("L") code = image_pixel_hash_code(thumb) previous_image = seen_images.get(code, None) if code in seen_images: print "'{}' already seen as '{}'".format(filepath, previous_image) else: seen_images[code] = filepath

import os for root, dirs, files in os.walk(IMAGE_ROOT): for filename in files: filepath = os.path.join(root, filename) try: process_image(filepath) except IOError: pass

1条回答

网友

1楼 · 发布于 2024-06-17 13:43:01

有很多方法可以比较图像，但是对于给定的示例，我怀疑简单性和速度是关键因素（因此您尝试使用哈希作为第一个过程）。这里有一些建议-在所有情况下，我建议缩小和裁剪图像到一个正常的大小和形状。你知道吗

在收缩之前平滑图像（高斯模糊），以尽量减少人工制品的影响。然后应用哈希或其他比较。你知道吗
将图像彼此相减（RGB）并检查余数。相同的图像将返回零，压缩伪影将导致微小的变化。可以对值进行阈值、求和或平均，并与截止值进行比较。你知道吗
使用标准距离算法（参见scipy.spatial.distance）计算两幅图像之间的“距离”。例如，euclidean距离将有效地给出与减法之和相同的结果，而cosine将忽略强度，但与图像上的变化轮廓相匹配，即相同图像的较暗版本将被视为等效。为此，你需要将你的图像展平到1D阵列。你知道吗

最后两种方法需要在上传时将每个图像与其他图像进行比较，对于大量的图像来说，这在计算上会非常昂贵。你知道吗

相关问题更多 >

编程相关推荐

热门问题

热门文章