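We noticed that one of our deployments was leaving a pile of defunct (zombie) processes behind, and we managed to reduce the problem to a very small program that shows it: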
multi.py:
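from multiprocessing import Pool, set_start_method

def f(x):
    return x*x

if __name__ == '__main__':
    # Use the 'spawn' start method instead of the platform default.
    set_start_method('spawn')
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
        # Stop accepting work, then wait for the workers to exit.
        p.close()
        p.join()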
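This program appears to leave a zombie process behind, but that is hard to observe, because when it is run from a regular shell the shell reaps the zombie.
In our deployment it is run from another Python program, so to simulate that we have: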
main.py:
from subprocess import run
from time import sleep

while True:
    # Run the reproducer and show its output.
    result = run(["python", "multi.py"], capture_output=True)
    print(result.stdout.decode('utf-8'))
    # List the process tree so leftover <defunct> entries are visible.
    result = run(["ps", "-ef", "--forest"], capture_output=True)
    print(result.stdout.decode('utf-8'), flush=True)
    sleep(1)
Running main.py produces the following output:
[1, 4, 9]
UID PID PPID C STIME TTY TIME CMD
root 1 0 11 11:33 pts/0 00:00:00 python main.py
root 8 1 0 11:33 pts/0 00:00:00 [python] <defunct>
root 17 1 0 11:33 pts/0 00:00:00 ps -ef --forest
[1, 4, 9]
UID PID PPID C STIME TTY TIME CMD
root 1 0 6 11:33 pts/0 00:00:00 python main.py
root 8 1 3 11:33 pts/0 00:00:00 [python] <defunct>
root 19 1 0 11:33 pts/0 00:00:00 [python] <defunct>
root 28 1 0 11:33 pts/0 00:00:00 ps -ef --forest
[1, 4, 9]
UID PID PPID C STIME TTY TIME CMD
root 1 0 4 11:33 pts/0 00:00:00 python main.py
root 8 1 1 11:33 pts/0 00:00:00 [python] <defunct>
root 19 1 3 11:33 pts/0 00:00:00 [python] <defunct>
root 30 1 0 11:33 pts/0 00:00:00 [python] <defunct>
root 39 1 0 11:33 pts/0 00:00:00 ps -ef --forest
[1, 4, 9]
UID PID PPID C STIME TTY TIME CMD
root 1 0 3 11:33 pts/0 00:00:00 python main.py
root 8 1 1 11:33 pts/0 00:00:00 [python] <defunct>
root 19 1 1 11:33 pts/0 00:00:00 [python] <defunct>
root 30 1 4 11:33 pts/0 00:00:00 [python] <defunct>
root 41 1 0 11:33 pts/0 00:00:00 [python] <defunct>
root 50 1 0 11:33 pts/0 00:00:00 ps -ef --forest
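To follow the growth without reading each listing by eye, the captured `ps` output could be scanned for `<defunct>` entries. This is only a rough sketch: the helper name `count_defunct` and the `SAMPLE` text are made up for illustration (the sample rows are copied from the second listing above), and it assumes the `ps -ef` format shown here, where a zombie's CMD column ends in `<defunct>`.

```python
def count_defunct(ps_output: str) -> int:
    """Count the lines of `ps -ef` output whose command ends in <defunct>."""
    return sum(
        1
        for line in ps_output.splitlines()
        if line.rstrip().endswith("<defunct>")
    )


# Hypothetical sample: rows copied from the second listing in this post.
SAMPLE = """\
UID PID PPID C STIME TTY TIME CMD
root 1 0 6 11:33 pts/0 00:00:00 python main.py
root 8 1 3 11:33 pts/0 00:00:00 [python] <defunct>
root 19 1 0 11:33 pts/0 00:00:00 [python] <defunct>
root 28 1 0 11:33 pts/0 00:00:00 ps -ef --forest
"""

if __name__ == "__main__":
    print(count_defunct(SAMPLE))
```

In main.py this could be called on `result.stdout.decode('utf-8')` from the `ps` run, so that each iteration logs a single number instead of the whole table.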
On the other hand, the following program does not produce defunct processes:
main_signal.py:
from os import wait
import signal
from subprocess import run
from time import sleep

def chld_handler(_signum, _frame):
    # Reap one exited child each time SIGCHLD is delivered.
    wait()

signal.signal(signal.SIGCHLD, chld_handler)

while True:
    result = run(["python", "multi.py"], capture_output=True)
    print(result.stdout.decode('utf-8'))
    result = run(["ps", "-ef", "--forest"], capture_output=True)
    print(result.stdout.decode('utf-8'), flush=True)
    sleep(1)
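One caveat with a handler that calls `wait()` exactly once is that `os.wait()` blocks while children are still running and raises `ChildProcessError` when there are none. Several child exits can also be delivered as a single SIGCHLD. A non-blocking variant might look like the sketch below. The name `reap_children` is made up for illustration, and this is one possible approach, not something confirmed as the recommended fix. Reaping with pid `-1` can also collect a child that `subprocess` is still tracking, in which case its real exit status may be lost.

```python
import os


def reap_children():
    """Reap every child that has already exited, without blocking.

    Returns the list of PIDs that were reaped.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call return immediately.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

It could be called from the SIGCHLD handler in place of `wait()`, or simply once per loop iteration after each `run(...)` call.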
Also, the following simple shell script does not produce zombies:
#!/usr/bin/env bash
while :; do
    python multi.py
    ps -ef --forest
    sleep 1
done
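Is this a bug in Python, or are you expected to reap any zombies left by subprocesses yourself (as Bash appears to do)?
All the code, along with a Dockerfile that makes the issue easy to reproduce, is available here: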
https://github.com/viktorvia/python-multi-issue
The issue is reproducible with Python 3.9.6, 3.7.4 and 3.7.11.