Multiprocessing image chunks

2 answers

User

#1 · edited 2024-04-16 09:15:31

I would do it like this, starting with the dependencies:

from multiprocessing import Pool
import numpy as np
from PIL import Image

# and some for testing
from random import random
from time import sleep

First I define a function that splits the image into "chunks", as you describe:

[the code for this function was not preserved in this copy of the answer]

It's a lazy iterator (a generator), so it yields chunks on demand and keeps working even when there are a lot of them.
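The body of the chunking function is missing here, so the following is only a minimal sketch of what it could look like, not the author's code. It assumes the call signature used further down, `chunkit(height, width)`, and the `((y0, y1), (x0, x1))` tuple layout that the worker unpacks. The `size` parameter and its default are hypothetical.

```python
def chunkit(height, width, size=256):
    """Lazily yield ((y0, y1), (x0, x1)) bounds that tile a height x width image.

    `size` is an assumed tile edge length; edge tiles are clipped to the image.
    """
    for y0 in range(0, height, size):
        y1 = min(y0 + size, height)
        for x0 in range(0, width, size):
            x1 = min(x0 + size, width)
            yield (y0, y1), (x0, x1)
```

Because this is a generator, nothing is computed until the pool asks for the next chunk, and the tiles cover every pixel exactly once.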

Then I define my worker function:

def dumb_func(cc):
    (y0,y1), (x0,x1) = cc
    # convert to floats for ease of processing
    chunk = image[y0:y1,x0:x1] / 255.
    # random slow down for testing
    # sleep(random() ** 6)
    res = chunk ** 2
    # convert back to bytes for efficiency
    return cc, (res * 255).astype(np.uint8)

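For efficiency, I make sure the source array stays as close to its original format as possible, and send the results back in that same format (this would obviously need some changes if you are dealing with other pixel formats).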
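To see why the format matters: each result is pickled to travel back from the worker to the parent, so the dtype decides how many bytes get moved. A small illustration, assuming a hypothetical 64x64 RGB chunk:

```python
import numpy as np

# Hypothetical 64x64 RGB chunk in the image's native byte format.
chunk_u8 = np.zeros((64, 64, 3), dtype=np.uint8)

# Dividing by a Python float promotes the array to float64,
# which is what the worker uses internally.
chunk_f64 = chunk_u8 / 255.

print(chunk_u8.dtype, chunk_u8.nbytes)    # size if sent back as bytes
print(chunk_f64.dtype, chunk_f64.nbytes)  # size if sent back as floats
```

The float64 version is eight times larger, which is the cost the worker avoids by converting back to `np.uint8` before returning.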

Then I put it all together:

if __name__ == '__main__':
    source = Image.open('tmp.jpeg')
    image = np.asarray(source)
    print("loaded", image.shape, image.dtype)

    with Pool() as pool:
        resit = pool.imap_unordered(
            dumb_func, chunkit(*image.shape[:2]))

        output = np.empty_like(image)
        for cc, res in resit:
            (y0,y1), (x0,x1) = cc
            output[y0:y1,x0:x1] = res

    im = Image.fromarray(output, 'RGB')
    im.save('out.jpeg')

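This works through a 15-megapixel image in a couple of seconds, most of which is spent loading and saving the image. It could probably be much smarter about array strides and cache-friendliness, but hopefully it helps!

Note: I think this code relies on CPython's Unix-style process forking semantics to share the image efficiently between processes. I don't know what would happen if you ran it on something else.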
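On platforms where `multiprocessing` starts workers with spawn rather than fork (Windows, and macOS by default in recent Python versions), each worker re-imports the module without running the main-guarded block, so the `image` global would not exist there. One possible workaround, sketched here under that assumption, is to hand the array to every worker once through the `initializer` argument of `Pool`. The names `init_worker`, `square_chunk`, and `process` are made up for this example.

```python
from multiprocessing import Pool

import numpy as np

_image = None  # per-process global, filled in by init_worker


def init_worker(image):
    """Runs once in each worker process and stores the source array."""
    global _image
    _image = image


def square_chunk(cc):
    """Same operation as dumb_func, but reads the worker's own copy."""
    (y0, y1), (x0, x1) = cc
    chunk = _image[y0:y1, x0:x1] / 255.
    return cc, (chunk ** 2 * 255).astype(np.uint8)


def process(image, chunks):
    """Send the image to each worker once, then fan the chunks out."""
    output = np.empty_like(image)
    with Pool(initializer=init_worker, initargs=(image,)) as pool:
        for cc, res in pool.imap_unordered(square_chunk, chunks):
            (y0, y1), (x0, x1) = cc
            output[y0:y1, x0:x1] = res
    return output
```

Call `process(image, chunkit(*image.shape[:2]))` from inside the `if __name__ == '__main__':` block. The image is pickled once per worker instead of being inherited for free, so this is slower to start than the fork-based version but does not depend on fork.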

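User

#2 · edited 2024-04-16 09:15:31

I've been writing code for basically the same thing. Right now the goal is just to replace white pixels with transparent ones, but it seems to replace the whole image, so there is a bug somewhere. It no longer raises errors from the multiprocessing module, though, so maybe it can serve as an example of how to load a Queue and then have your worker processes work through it!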

from PIL import Image
from multiprocessing import Process, JoinableQueue
from threading import Thread
from time import time

def worker_function(q, new_data):
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
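        # NOTE: new_data is a plain list, so each process writes to its own
        # copy and the parent's list never changes
        new_data[index] = out_pixel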
        q.task_done()

if __name__ == "__main__":
    start = time()
    q = JoinableQueue()

    my_image = Image.open('InputImage.jpg')
    my_image = my_image.convert('RGBA')
    datas = list(my_image.getdata())
    new_data = [0] * len(datas) # make a blank array the size of our image to fill later

    print('putting image into queue')
    for count, item in enumerate(datas):
        q.put((count, item))

    print('starting workers')
    worker_count = 50
    processes = []
    for i in range(worker_count):
        p = Process(target=worker_function, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    my_image.putdata(new_data)
    my_image.save('output.png', "PNG")

    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))

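I think it's important to "protect" the code inside the if __name__ == "__main__" block, otherwise the spawned processes seem to run it as well.

Update

It looks like you need to use a Manager() (or maybe there are other ways I don't know about!). I changed the code to: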

[the updated code was not preserved in this copy of the answer]
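The updated code is missing here. The reason a Manager helps is that a plain list passed to a `Process` is copied into each worker, so writes never reach the parent, which would explain the whole image being replaced. What follows is only a rough sketch of a Manager-based version, not the author's actual code. The helper names `replace_white` and `process_pixels` are hypothetical.

```python
from multiprocessing import JoinableQueue, Manager, Process


def replace_white(pixel, threshold=240):
    """Return a fully transparent pixel for near-white input, else the pixel."""
    if pixel[0] > threshold and pixel[1] > threshold and pixel[2] > threshold:
        return (0, 0, 0, 0)
    return pixel


def worker_function(q, new_data):
    while True:
        index, pixel = q.get()
        # new_data is a list proxy: this write is forwarded to the manager
        # process, so the parent sees it.
        new_data[index] = replace_white(pixel)
        q.task_done()


def process_pixels(pixels, worker_count=4):
    """Fill a managed list from worker processes and return it as a plain list."""
    with Manager() as manager:
        new_data = manager.list([0] * len(pixels))
        q = JoinableQueue()
        for item in enumerate(pixels):
            q.put(item)
        for _ in range(worker_count):
            Process(target=worker_function, args=(q, new_data), daemon=True).start()
        q.join()  # returns once every queued pixel has been marked done
        return list(new_data)
```

It would be called from the main-guarded block, for example `my_image.putdata(process_pixels(datas))`. Every write is a round trip to the manager process, which is a likely source of slowness.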

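This doesn't seem to be the fastest option, though! I'm sure there are other ways to speed it up. My code for doing the same thing with Threads looks very similar: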

from PIL import Image
from threading import Thread
from queue import Queue
import time

start = time.time()
q = Queue()

planeIm = Image.open('InputImage.jpg')
planeIm = planeIm.convert('RGBA')
datas = planeIm.getdata()
new_data = [0] * len(datas)

print('putting image into queue')
for count, item in enumerate(datas):
    q.put((count, item))

def worker_function():
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()

print('starting workers')
worker_count = 100
for i in range(worker_count):
    t = Thread(target=worker_function)
    t.daemon = True
    t.start()
print('main thread waiting')
q.join()
print('Queue has been joined')
planeIm.putdata(new_data)
planeIm.save('output.png', "PNG")

end = time.time()

elapsed = end - start
print('{:.3f} seconds elapsed'.format(elapsed))

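However, processing my image takes about 23 seconds with threads and about 170 seconds with multiprocessing!! I suspect this is because of the larger overhead of starting Process objects, and because my per-pixel algorithm is currently trivial (just the if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240: bit), so I'm probably not getting the speedup that a more complex per-pixel algorithm would bring. Also note what the multiprocessing documentation says about managers: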

a single manager can be shared by processes on different computers over a network. They are, however, slower than using shared memory.

This leads me to believe there are faster options.
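One candidate, offered as an untested guess rather than a measured result, is to stop queueing pixels one at a time and instead hand each worker a large batch, so the inter-process overhead is paid once per batch. The sketch below uses `Pool.imap`, which returns results in order. The names `clear_white`, `batches`, and `process_in_batches`, and the batch size, are assumptions made for this example.

```python
from multiprocessing import Pool


def clear_white(pixels, threshold=240):
    """Process a whole batch of RGBA pixels in a single task."""
    return [
        (0, 0, 0, 0)
        if p[0] > threshold and p[1] > threshold and p[2] > threshold
        else p
        for p in pixels
    ]


def batches(seq, size):
    """Yield consecutive slices of seq with at most `size` items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def process_in_batches(pixels, batch_size=65536, workers=None):
    """Map clear_white over batches and stitch the results back together."""
    out = []
    with Pool(workers) as pool:
        for part in pool.imap(clear_white, batches(pixels, batch_size)):
            out.extend(part)
    return out
```

From the main-guarded block this could be used as `my_image.putdata(process_in_batches(datas))`. No shared list or Manager is needed because each batch's result is returned to the parent directly.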
