Python multiprocessing function returns multiple outputs

Asked 2025-04-17 14:02

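I'm trying to use multiprocessing to return a list, but instead of getting one result after all processes have finished, the call to mp_factorizer appears to return several times, like this: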

None
None
(returns list)

In this example I used 2 worker processes. If I use 5, I get 5 None values before the final list appears. Here is the code:

def mp_factorizer(nums, nprocs, objecttouse):
    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        chunksize = int(math.ceil(len(nums) / float(nprocs)))
        procs = []
        for i in range(nprocs):
            p = multiprocessing.Process(
                    target=worker,                   
                    args=(nums[chunksize * i:chunksize * (i + 1)],
                          out_q,
                    objecttouse))
            procs.append(p)
            p.start()

        # Collect all results into a single result dict. We know how many dicts
        # with results to expect.
        resultlist = []
        for i in range(nprocs):
            temp=out_q.get()
            index =0
            for i in temp:
                resultlist.append(temp[index][0][0:])
                index +=1

        # Wait for all worker processes to finish
        for p in procs:
            p.join()
            resultlist2 = [x for x in resultlist if x != []]
        return resultlist2

def worker(nums, out_q, objecttouse):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []
    for n in nums:        
        outputlist=objecttouse.getevents(n)
        if outputlist:
            outlist.append(outputlist)   
    out_q.put(outlist)

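mp_factorizer takes a list of items, the number of worker processes, and an object for the workers to use. It splits the item list into chunks so that every worker gets a roughly equal share, then starts the workers. Each worker uses the object to compute results for its chunk and puts them on a queue. mp_factorizer is supposed to collect all results from the queue, merge them into one large list, and return that list. Instead, I get multiple return values.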
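
To make the splitting step concrete, here is a minimal sketch of the chunking arithmetic used in mp_factorizer. The helper name split_into_chunks is hypothetical and not part of the code above; it only isolates the slicing so the chunk boundaries are easy to inspect.

```python
import math


def split_into_chunks(nums, nprocs):
    """Split nums into nprocs consecutive slices (hypothetical helper)."""
    # Same arithmetic as in mp_factorizer: round the chunk size up so that
    # nprocs slices are always enough to cover the whole list.
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    return [nums[chunksize * i:chunksize * (i + 1)] for i in range(nprocs)]


# Seven items over three workers: two full chunks and one short one.
print(split_into_chunks(list(range(7)), 3))
```

Because the chunk size is rounded up, the last slices can be shorter or even empty (for example 4 items over 3 workers). A worker that receives an empty slice still puts one empty list on the queue, so reading exactly nprocs items from the queue remains correct.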

What am I doing wrong? Or is this expected because of the unusual way Windows handles multiprocessing? (Python 2.7.3, Windows 7 64-bit)

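Edit: the problem was that if __name__ == '__main__': was in the wrong place. I found this out while working on another problem; for a full explanation, see "Using multiprocessing in a subprocess".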

Tags: windows, threads, subprocess, multiprocessing, parallel processing, queue, result merging

2 Answers

Your if __name__ == '__main__' statement is in the wrong place. Put it around the print statement so that the child processes do not execute that line:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

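Right now your if is inside mp_factorizer. On Windows, every child process imports the main file again under a different module name, so the unguarded print line runs in each child as well. There the condition is false, the function body is skipped, and mp_factorizer returns None, which is what gets printed once per child.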
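
The None values can be reproduced without starting any processes. The sketch below is only an illustration: guarded is a hypothetical stand-in for mp_factorizer, and the module name is passed in as an argument instead of being read from __name__.

```python
def guarded(module_name):
    """Stand-in for mp_factorizer with its whole body inside the guard."""
    if module_name == '__main__':
        return ['result']
    # Execution falls off the end of the function here, so Python
    # returns None implicitly.


# In the parent process the main file is called '__main__'.
print(guarded('__main__'))
# In a spawned child, Python 3 imports the main file as '__mp_main__'
# (Python 2.7 uses a different name, but it is likewise not '__main__').
print(guarded('__mp_main__'))
```

The first call prints the list and the second prints None, which matches the pattern in the question: one None per child process, followed by the real result from the parent.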

Answered 2025-04-17 by Python大师


The if __name__ == '__main__' statement is in the wrong place. A simple fix is to guard only the call to mp_factorizer, as Janne Karila suggested:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

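However, on Windows the main file is executed once when you launch it and then once more for each worker process. With 2 workers, the main file therefore runs 3 times in total, and only the guarded code is skipped in the workers.

This can cause problems whenever the main file does other computation, and at the very least it makes things slower. Only the worker function is supposed to run multiple times, but on Windows all code that is not protected by if __name__ == '__main__' is executed in every process.

So the solution is to protect the entire main program: make sure all of its code runs after if __name__ == '__main__'.

If the worker function is in the same file, it has to stay outside this if statement. Otherwise it is never defined in the child processes, which import the file under a different name, and it cannot be called there.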

Pseudocode for the main file:

# Import stuff
if __name__ == '__main__':
    # Execute whatever you want here; it will only be executed
    # as often as you intend it to.
    # Call the function that starts multiprocessing,
    # in this case mp_factorizer().
    # There is no worker function code here; it's in another file.
    pass  # placeholder so the outline is valid Python

Even when the entire main program is protected, the worker function can still be started as long as it lives in another file.

Pseudocode for the main file, including the worker function:

# Import stuff
# If the worker code is in the main file, keep it outside the if statement:
def worker():
    pass  # worker code goes here
if __name__ == '__main__':
    # Execute whatever you want here; it will only be executed
    # as often as you intend it to.
    # Call the function that starts multiprocessing,
    # in this case mp_factorizer().
    pass  # placeholder so the outline is valid Python
# All code outside of the if statement is executed multiple times,
# depending on the number of worker processes.
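
As a rough, runnable version of the second outline, the sketch below puts the worker and mp_factorizer at module level and guards only the code that starts the work. It rests on several assumptions: it uses Python 3 syntax rather than the asker's Python 2.7, EventSource is a made-up stand-in for objecttouse, and the merge step is simplified to concatenating the workers' lists instead of the indexing used in the question.

```python
import math
import multiprocessing


class EventSource(object):
    """Hypothetical stand-in for the asker's objecttouse."""

    def getevents(self, n):
        # Return nothing for odd numbers to mimic "no events found".
        return [n * n] if n % 2 == 0 else []


def worker(nums, out_q, objecttouse):
    """Runs in a child process: queue the non-empty results for nums."""
    outlist = []
    for n in nums:
        events = objecttouse.getevents(n)
        if events:
            outlist.append(events)
    out_q.put(outlist)


def mp_factorizer(nums, nprocs, objecttouse):
    """Fan nums out to nprocs processes and merge their result lists."""
    out_q = multiprocessing.Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []
    for i in range(nprocs):
        chunk = nums[chunksize * i:chunksize * (i + 1)]
        p = multiprocessing.Process(target=worker,
                                    args=(chunk, out_q, objecttouse))
        procs.append(p)
        p.start()

    # Read the queue before joining; each worker puts exactly one list.
    resultlist = []
    for _ in range(nprocs):
        resultlist.extend(out_q.get())

    for p in procs:
        p.join()
    return resultlist


if __name__ == '__main__':
    # Only the parent process reaches this call, so it prints once.
    print(sorted(mp_factorizer(list(range(6)), 2, EventSource())))
```

Keeping the guard out of mp_factorizer means the function always returns a list, and keeping worker and EventSource at module level lets the child processes find them when they import the file.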

For a more detailed explanation and runnable code, see "Using multiprocessing in a subprocess".

Answered 2025-04-17 by Python大师

