Python multiprocessing.Pool forks a large amount of memory from the parent process

Published 2024-05-15 06:03:16

I am using multiprocessing.Pool together with the maxtasksperchild parameter.

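The problem is that once a worker reaches maxtasksperchild, the pool terminates that child process and creates a new one. The new child appears to be created with fork(), so it copies a large amount of memory from the parent process.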
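The worker replacement described above can be observed with a small sketch. It assumes Python 3 on a platform where the "fork" start method is available (so not Windows); the names report_pid and collect_worker_pids are made up for illustration. With a single worker and maxtasksperchild=2, every second task should come back from a different PID.

```python
import multiprocessing
import os


def report_pid(_):
    """Return the PID of the worker process that ran this task."""
    return os.getpid()


def collect_worker_pids(num_tasks=6, workers=1, tasks_per_child=2):
    # "fork" is requested explicitly because it is the start method the
    # question is about. Assumption: a POSIX platform where fork exists.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(workers, maxtasksperchild=tasks_per_child) as pool:
        # chunksize=1 so that every item counts as one task for the worker.
        return pool.map(report_pid, range(num_tasks), chunksize=1)


if __name__ == "__main__":
    pids = collect_worker_pids()
    print(pids)
    print("distinct worker processes:", len(set(pids)))
```

Each replacement worker is forked from the parent as it is at that moment, which is why a parent that has grown large is a concern here.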

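In my case the parent process needs to hold a lot of memory, but I do not want the child processes to take up that much (they never even use it), and I still want to keep the functionality of multiprocessing.Pool.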

Can anyone help?

Demo code (this uses about 3.6 GB of memory in total):

import multiprocessing
import random
import time

from setproctitle import setproctitle  # third-party package

# large cache held by the parent process
cache_list = []


def init_child_process():
    # set the name of the child process
    setproctitle('child process')


def child_task():
    time.sleep(1)


def test():
    # build a large amount of memory in the parent process
    # attention: each process costs about 600M, 3.6G in total
    for _ in range(500000):
        size = random.randint(10, 20)
        data = {
            "key_1": "value_1",
            "key_2": "value_2",
            "key_3": "value_3",
            "key_4": "long_value " * size,
            "key_5": "long_value " * size
        }
        doc_list = [data] * 10
        cache_list.append(doc_list)

    # create the child process pool
    pool = multiprocessing.Pool(5, initializer=init_child_process,
                                initargs=())

    # add tasks so that the pool puts the child processes to work
    for _ in range(10):
        pool.apply_async(child_task)

    # if you add a breakpoint here (or use pdb),
    # you will find that both the parent and the child processes use a lot of memory
    # maybe it is due to fork(),
    # but I do not want the child processes to be forked from the parent
    import pdb
    pdb.set_trace()


if __name__ == '__main__':
    test()
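
One direction that might help, shown only as a sketch and not as a confirmed fix for the demo above: on Python 3 a pool can be created from a "spawn" context, so that every worker (including the replacements started after maxtasksperchild is reached) is a fresh interpreter instead of a fork() of the parent. The names cache_size_in_worker and main, and the small cache size, are assumptions for illustration. The sketch relies on the cache being filled inside main(), which only the parent runs.

```python
import multiprocessing
import os

# Stand-in for the large parent-side cache from the demo (kept small here).
cache_list = []


def cache_size_in_worker(_):
    """Return (pid, number of cache entries visible in this process)."""
    return os.getpid(), len(cache_list)


def main():
    # Fill the cache in the parent only. Spawned workers re-import this
    # module but never call main(), so their cache_list stays empty.
    cache_list.extend(range(100000))

    # "spawn" starts each worker as a new interpreter rather than fork()ing
    # the parent, so workers do not inherit the parent's heap.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(2, maxtasksperchild=1) as pool:
        pending = pool.map_async(cache_size_in_worker, range(4), chunksize=1)
        # The timeout keeps the script from hanging if workers fail to start.
        for pid, size in pending.get(timeout=120):
            print("worker", pid, "sees", size, "cache entries")

    print("parent", os.getpid(), "holds", len(cache_list), "cache entries")


if __name__ == "__main__":
    main()
```

The trade-offs are that spawned workers start more slowly, that task functions and their arguments have to be picklable, and that any module-level code runs again in every worker, so the large data has to be built under the `if __name__ == "__main__":` guard rather than at import time.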
