How do I stop reading a process's output in Python without blocking?

Asked 2025-04-16 08:24

I have a Python program running on Linux that looks roughly like this:

import os
import time

process = os.popen("top").readlines()

time.sleep(1)

os.popen("killall top")

print process

The program hangs at this line:

process = os.popen("top").readlines()

This happens with tools that keep updating their output, such as the top command.

The best attempt I have made so far:

import os
import time
import subprocess

process = subprocess.Popen('top')

time.sleep(2)

os.popen("killall top")

print process

This works better than the first attempt (top gets killed), but it prints:

<subprocess.Popen object at 0x97a50cc>

Second attempt:

import os
import time
import subprocess

process = subprocess.Popen('top').readlines()

time.sleep(2)

os.popen("killall top")

print process

The result is the same as the first attempt. It hangs because of readlines().

The result should look like this:

top - 05:31:15 up 12:12,  5 users,  load average: 0.25, 0.14, 0.11
Tasks: 174 total,   2 running, 172 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.3%us,  3.8%sy,  0.1%ni, 85.9%id,  0.9%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1992828k total,  1849456k used,   143372k free,   233048k buffers
Swap:  4602876k total,        0k used,  4602876k free,  1122780k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
31735 Barakat   20   0  246m  52m  20m S 19.4  2.7  13:54.91 totem              
 1907 root      20   0 91264  45m  15m S  1.9  2.3  38:54.14 Xorg               
 2138 Barakat   20   0 17356 5368 4284 S  1.9  0.3   3:00.15 at-spi-registry    
 2164 Barakat    9 -11  164m 7372 6252 S  1.9  0.4   2:54.58 pulseaudio         
 2394 Barakat   20   0 27212 9792 8256 S  1.9  0.5   6:01.48 multiload-apple    
 6498 Barakat   20   0 56364  30m  18m S  1.9  1.6   0:03.38 pyshell            
    1 root      20   0  2880 1416 1208 S  0.0  0.1   0:02.02 init               
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.02 kthreadd           
    3 root      RT   0     0    0    0 S  0.0  0.0   0:00.12 migration/0        
    4 root      20   0     0    0    0 S  0.0  0.0   0:02.07 ksoftirqd/0        
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0         
    9 root      20   0     0    0    0 S  0.0  0.0   0:01.43 events/0           
   11 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cpuset             
   12 root      20   0     0    0    0 S  0.0  0.0   0:00.02 khelper            
   13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 netns              
   14 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr          
   15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pm

and it should be stored in the variable process. Does anyone have an idea? I am really stuck now.

5 Answers

0

(J.F. Sebastian, your code works great, and I think it is better than my solution =) )

I solved the problem in a different way.

Instead of printing the output directly to the terminal, I write it to a file, tmp_file:

top >> tmp_file

Then I use cat to read the file's contents (that is, the output of top) and store them as the value of process:

cat tmp_file

This does what I wanted. Here is the final code:

import os
import subprocess
import time

subprocess.Popen("top >> tmp_file",shell = True)

time.sleep(1)

os.popen("killall top")

process = os.popen("cat tmp_file").read()

os.popen("rm tmp_file")

print process

# Something is better than nothing =)

Many thanks to everyone for the help!

3

Instead of top, I suggest using ps, which gives you the same information but prints it only once, rather than refreshing it every second forever.

You will also need to pass some options to ps; I usually use ps aux.
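
As a rough illustration of this suggestion, here is a minimal sketch written for Python 3 (the other snippets on this page are Python 2). The helper name snapshot is made up for this example, and it assumes that ps is installed and that the command exits on its own:

```python
import subprocess

def snapshot(cmd=("ps", "aux")):
    """Run a command that exits on its own and return its output as text."""
    # check_output() waits for the command to finish, so it only suits
    # one-shot commands such as `ps`, not an interactive `top`.
    return subprocess.check_output(list(cmd), universal_newlines=True)

if __name__ == "__main__":
    process = snapshot()
    print(process, end="")
```

If the layout of top itself is needed, procps top on Linux also has a batch mode, so something like snapshot(["top", "-b", "-n", "1"]) should return a single frame and exit; other top implementations may use different flags.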

26

Solution that prints only the tail of the output

#!/usr/bin/env python
"""Start process; wait 2 seconds; kill the process; print all process output."""
import subprocess
import tempfile
import time

def main():
    # open temporary file (it is automatically deleted when it is closed)
    #  `Popen` requires `f.fileno()` so `SpooledTemporaryFile` adds nothing here
    f = tempfile.TemporaryFile() 

    # start process, redirect stdout
    p = subprocess.Popen(["top"], stdout=f)

    # wait 2 seconds
    time.sleep(2)

    # kill process
    #NOTE: if it doesn't kill the process then `p.wait()` blocks forever
    p.terminate() 
    p.wait() # wait for the process to terminate otherwise the output is garbled

    # print saved output
    f.seek(0) # rewind to the beginning of the file
    print f.read(), 
    f.close()

if __name__=="__main__":
    main()

You can read the process output in another thread and keep only the last few lines you need in a queue:

import collections
import subprocess
import time
import threading

def read_output(process, append):
    for line in iter(process.stdout.readline, ""):
        append(line)

def main():
    # start process, redirect stdout
    process = subprocess.Popen(["top"], stdout=subprocess.PIPE, close_fds=True)
    try:
        # save last `number_of_lines` lines of the process output
        number_of_lines = 200
        q = collections.deque(maxlen=number_of_lines) # atomic .append()
        t = threading.Thread(target=read_output, args=(process, q.append))
        t.daemon = True
        t.start()

        #
        time.sleep(2)
    finally:
        process.terminate() #NOTE: it doesn't ensure the process termination

    # print saved lines
    print ''.join(q)

if __name__=="__main__":
    main()

This approach requires q.append() to be an atomic operation. Otherwise the output may be corrupted.

signal.alarm() solution

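Instead of reading in another thread, you can use signal.alarm() to call process.terminate() after a specified timeout. However, it may not interact very well with the subprocess module. Based on @Alex Martelli's answer: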

import collections
import signal
import subprocess

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def main():
    # start process, redirect stdout
    process = subprocess.Popen(["top"], stdout=subprocess.PIPE, close_fds=True)

    # set signal handler
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(2) # produce SIGALRM in 2 seconds

    try:
        # save last `number_of_lines` lines of the process output
        number_of_lines = 200
        q = collections.deque(maxlen=number_of_lines)
        for line in iter(process.stdout.readline, ""):
            q.append(line)
        signal.alarm(0) # cancel alarm
    except Alarm:
        process.terminate()
    finally:
        # print saved lines
        print ''.join(q)

if __name__=="__main__":
    main()

This approach works only on *nix systems. It may block if process.stdout.readline() does not return.

threading.Timer solution

import collections
import subprocess
import threading

def main():
    # start process, redirect stdout
    process = subprocess.Popen(["top"], stdout=subprocess.PIPE, close_fds=True)

    # terminate process in timeout seconds
    timeout = 2 # seconds
    timer = threading.Timer(timeout, process.terminate)
    timer.start()

    # save last `number_of_lines` lines of the process output
    number_of_lines = 200
    q = collections.deque(process.stdout, maxlen=number_of_lines)
    timer.cancel()

    # print saved lines
    print ''.join(q),

if __name__=="__main__":
    main()

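This approach should also work on Windows. Here I use process.stdout as an iterable, which may introduce additional output buffering. If that is undesirable, you can switch to the iter(process.stdout.readline, "") approach. If the process does not exit on process.terminate(), the script hangs.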
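
For comparison, on Python 3.3 or later (an assumption; the code in this answer is Python 2) the timeout can be handed to Popen.communicate() instead of a Timer. The sketch below follows the kill-then-communicate pattern from the subprocess documentation; the function name run_with_timeout is hypothetical, and it returns all captured output rather than only the last lines:

```python
import subprocess

def run_with_timeout(cmd, timeout=2.0):
    """Return whatever `cmd` wrote to stdout within `timeout` seconds."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # communicate() reads stdout and waits for the child to exit
        output, _ = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # kill() rather than terminate(), so a child that ignores
        # SIGTERM cannot keep the script waiting
        process.kill()
        # the second call returns the output collected before the kill
        output, _ = process.communicate()
    return output  # bytes

if __name__ == "__main__":
    print(run_with_timeout(["top"]).decode(errors="replace"), end="")
```

Because the child is killed rather than asked to terminate, this variant should not hang on a process that ignores process.terminate(), at the cost of giving the child no chance to clean up.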

No threads, no signals solution

import collections
import subprocess
import sys
import time

def main():
    args = sys.argv[1:]
    if not args:
        args = ['top']

    # start process, redirect stdout
    process = subprocess.Popen(args, stdout=subprocess.PIPE, close_fds=True)

    # save last `number_of_lines` lines of the process output
    number_of_lines = 200
    q = collections.deque(maxlen=number_of_lines)

    timeout = 2 # seconds
    now = start = time.time()    
    while (now - start) < timeout:
        line = process.stdout.readline()
        if not line:
            break
        q.append(line)
        now = time.time()
    else: # on timeout
        process.terminate()

    # print saved lines
    print ''.join(q),

if __name__=="__main__":
    main()

This approach uses neither threads nor signals, but it produces garbled output in the terminal. It also hangs if process.stdout.readline() blocks.
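
One possible way around the blocking readline(), sketched here for Python 3 on POSIX systems only, is to wait on the pipe with select.select() and read with os.read(), so that no single call can outlast the deadline. The name tail_with_deadline and the 4096-byte read size are choices made for this illustration, not part of the answer above:

```python
import collections
import os
import select
import subprocess
import time

def tail_with_deadline(cmd, timeout=2.0, number_of_lines=200):
    """Return the last lines `cmd` printed within `timeout` seconds."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = process.stdout.fileno()
    chunks = []
    deadline = time.monotonic() + timeout
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # timed out
            # wait until the pipe is readable, but never past the deadline
            readable, _, _ = select.select([fd], [], [], remaining)
            if not readable:
                break  # timed out with nothing to read
            data = os.read(fd, 4096)  # returns what is available right now
            if not data:
                break  # EOF: the child closed its stdout
            chunks.append(data)
    finally:
        if process.poll() is None:
            process.kill()  # the child is still running: stop it
        process.stdout.close()
        process.wait()
    # split the raw bytes into lines and keep only the last ones
    lines = b"".join(chunks).splitlines(True)
    return collections.deque(lines, maxlen=number_of_lines)

if __name__ == "__main__":
    for line in tail_with_deadline(["top"]):
        print(line.decode(errors="replace"), end="")
```

The terminal output of top is still likely to look garbled, since top writes escape sequences; only the hang on a silent child is addressed here.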
