Python multiprocessing processes never finish when given a large amount of work

3 votes
1 answer
1021 views
Asked 2025-04-18 03:40

I don't believe this is a duplicate of this question, because that asker's problem came from using multiprocessing.Pool, which I am not using.

This program:

import multiprocessing
import time

def task_a(procrange,result):
    "Naively identify prime numbers in an iterator of integers. Procrange may not contain negative numbers, 0, or 1. Result should be a multiprocessing.queue."

    for i in procrange: #For every number in our given iterator...
        for t in range (2,(i//2)+1): #Take every number up to half of it...
            if (i % t == 0): #And see if that number goes evenly into it.
                break   #If it does, it ain't prime.
        else:
            #print(i)
            result.put(i) #If the loop never broke, it's prime.




if __name__ == '__main__':
    #We seem to get the best times with 4 processes, which makes some sense since my machine has 4 cores (apparently hyperthreading doesn't do shit)
    #Time taken more or less halves for every process up to 4, then very slowly climbs back up again as overhead eclipses the benefit from concurrency
    processcount=4
    procs=[]
    #Will search up to this number.
    searchto=11000
    step=searchto//processcount
    results=multiprocessing.Queue(searchto)
    for t in range(processcount):
        procrange=range(step * t, step * (t+1) )
        print("Process",t,"will search from",step*t,"to",step*(t+1))
        procs.append(
                     multiprocessing.Process(target=task_a, name="Thread "+str(t),args=(procrange,results))
                     )
    starttime=time.time()
    for theproc in procs:
        theproc.start()
    print("Processing has begun.")

    for theproc in procs:
        theproc.join()
        print(theproc.name,"has terminated and joined.")
    print("Processing finished!")
    timetook=time.time()-starttime

    print("Compiling results...")

    resultlist=[]
    try:
        while True:
            resultlist.append(results.get(False))
    except multiprocessing.queues.Empty:
        pass

    print(resultlist)
    print("Took",timetook,"seconds to find",len(resultlist),"primes from 0 to",searchto,"with",processcount,"concurrent executions.")

...runs just fine, producing:

Process 0 will search from 0 to 2750
Process 1 will search from 2750 to 5500
Process 2 will search from 5500 to 8250
Process 3 will search from 8250 to 11000
Processing has begun.
Thread 0 has terminated and joined.
Thread 1 has terminated and joined.
Thread 2 has terminated and joined.
Thread 3 has terminated and joined.
Processing finished!
Compiling results...
[Many Primes]
Took 0.3321540355682373 seconds to find 1337** primes from 0 to 11000 with 4 concurrent executions.

But if you increase searchto by 500...

Processing has begun.
Thread 0 has terminated and joined.
Thread 1 has terminated and joined.
Thread 2 has terminated and joined.

...and then nothing. Process Hacker shows the Python processes each using about 12% CPU, gradually tapering off, but they never finish. They just hang there until I kill them manually.

Why is that?

** Apparently God, or Guido (Python's creator), has a cruel sense of humor.

1 Answer

1

It looks like the problem is the result.put(i) line, because when I commented it out the script started working properly. So I'd suggest not holding your results in a multiprocessing.Queue. Instead, you can use a database such as MySQL, MongoDB, etc. Note: you can't use SQLite, because in SQLite only one process can be making changes to the database at any moment (from the docs).
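As a rough illustration of that suggestion, here is a minimal sketch of the worker writing each prime straight to MongoDB via pymongo instead of putting it on a Queue. The database and collection names (primes_db, primes) and the assumption that a MongoDB server is listening on localhost:27017 are mine, not part of the original answer.

import multiprocessing
from pymongo import MongoClient  # assumes pymongo is installed

def task_a(procrange):
    "Naive prime search; write each prime to MongoDB instead of a Queue."
    # Each worker opens its own connection: a MongoClient should not be
    # shared across a fork, so create it inside the child process.
    client = MongoClient("localhost", 27017)
    collection = client.primes_db.primes  # hypothetical db/collection names
    for i in procrange:
        for t in range(2, (i // 2) + 1):
            if i % t == 0:
                break
        else:
            collection.insert_one({"value": i})
    client.close()

if __name__ == "__main__":
    searchto = 11500
    processcount = 4
    step = searchto // processcount
    procs = [multiprocessing.Process(target=task_a,
                                     args=(range(step * t, step * (t + 1)),))
             for t in range(processcount)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Read the results back from the database after all workers have exited.
    client = MongoClient("localhost", 27017)
    print(client.primes_db.primes.count_documents({}), "primes stored")

With the results going to an external store, nothing is left buffered in a multiprocessing.Queue, so the parent's join() calls only wait for the workers' own loops to finish.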
