Performance of subprocess.check_output vs subprocess.call

Asked 2025-04-18 17:27

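I have been using subprocess.check_output() to capture output from subprocesses, but I've run into performance problems in certain circumstances. I'm running this on a RHEL6 machine.

The calling Python environment is compiled for Linux and 64-bit. The subprocess I'm executing is a shell script that eventually launches a Windows python.exe process via Wine (why this is necessary is another story). As input to the shell script I pass a small piece of Python code, which gets handed on to python.exe.

While the system is under moderate to heavy load (40% to 70% CPU utilization), I've noticed that subprocess.check_output(cmd, shell=True) can introduce a significant delay (up to about 45 seconds) between the subprocess finishing and check_output returning. Looking at the output of ps -efH during that time shows the called subprocess as sh <defunct>, until it finally returns with a normal zero exit status.

Conversely, running the same command with subprocess.call(cmd, shell=True) under the same moderate to heavy load returns immediately with no delay, with all output printed to STDOUT/STDERR (rather than returned from the function call).

Why is there such a significant delay only when check_output() redirects STDOUT/STDERR into its return value, and not when call() simply prints it back to the parent's STDOUT/STDERR?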

2 Answers

Let's look at the code. .check_output checks on the child through process.poll(), which ends up in this non-blocking routine:

    def _internal_poll(self, _deadstate=None, _waitpid=os.waitpid,
            _WNOHANG=os.WNOHANG, _os_error=os.error, _ECHILD=errno.ECHILD):
        """Check if child process has terminated.  Returns returncode
        attribute.

        This method is called by __del__, so it cannot reference anything
        outside of the local scope (nor can any methods it calls).

        """
        if self.returncode is None:
            try:
                pid, sts = _waitpid(self.pid, _WNOHANG)
                if pid == self.pid:
                    self._handle_exitstatus(sts)
            except _os_error as e:
                if _deadstate is not None:
                    self.returncode = _deadstate
                if e.errno == _ECHILD:
                    # This happens if SIGCLD is set to be ignored or
                    # waiting for child processes has otherwise been
                    # disabled for our process.  This child is dead, we
                    # can't get the status.
                    # http://bugs.python.org/issue15756
                    self.returncode = 0
        return self.returncode

The wait used by .call looks like this:

    def wait(self):
        """Wait for child process to terminate.  Returns returncode
        attribute."""
        while self.returncode is None:
            try:
                pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
            except OSError as e:
                if e.errno != errno.ECHILD:
                    raise
                # This happens if SIGCLD is set to be ignored or waiting
                # for child processes has otherwise been disabled for our
                # process.  This child is dead, we can't get the status.
                pid = self.pid
                sts = 0
            # Check the pid and loop as waitpid has been known to return
            # 0 even without WNOHANG in odd situations.  issue14396.
            if pid == self.pid:
                self._handle_exitstatus(sts)
        return self.returncode

Notice the bug referenced in _internal_poll. You can view it at http://bugs.python.org/issue15756. It is pretty much exactly the problem you are running into.
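To make the two waiting styles concrete, here is a minimal sketch, assuming Python 3 and using the current interpreter as a stand-in child process (the child command is illustrative, not the asker's script). `poll()` is the non-blocking check that ends up in `_internal_poll`, while `wait()` blocks until the child has exited.

```python
import subprocess
import sys


def poll_then_wait(seconds=0.5):
    """Start a short-lived child and compare poll() with wait()."""
    # Hypothetical stand-in child: a Python process that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(%s)" % seconds]
    )
    # Non-blocking: returns None while the child is still running.
    early = child.poll()
    # Blocking: returns only once the child has exited and been reaped.
    final = child.wait()
    return early, final


if __name__ == "__main__":
    print(poll_then_wait())
```

Between the child exiting and the parent calling `wait()` (or a successful `poll()`), the child shows up as `<defunct>` in `ps`, which matches what the question describes.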

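Edit: another potential difference between .call and .check_output is that .check_output actually cares about the child's standard streams and performs I/O on the pipe it sets up. If the process you are running gets itself into a zombie state, it is possible that a read against a pipe in a defunct state is causing the hang you are experiencing.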

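In most cases zombie states get cleaned up pretty quickly, but they may not be if, for instance, they are interrupted during a system call such as read or write. The read/write system call itself should of course be interrupted as soon as the I/O can no longer be performed, but it is possible that you are hitting some sort of race condition where things get killed in a bad order.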
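One way such a stall can arise, offered as an assumption rather than a diagnosis of the asker's system: a read on a pipe only reaches end-of-file once every process holding the write end has closed it. If the child starts something that outlives it and inherits its stdout, the child can already be a zombie while check_output is still blocked in the read. The sketch below, assuming Python 3, reproduces that with a hypothetical child that starts a sleeping grandchild and exits immediately.

```python
import subprocess
import sys
import time

# Hypothetical grandchild: sleeps while holding whatever stdout it inherited.
GRANDCHILD = "import time; time.sleep(1.0)"
# Hypothetical child: starts the grandchild and exits right away.
CHILD = (
    "import subprocess, sys; "
    "subprocess.Popen([sys.executable, '-c', %r])" % GRANDCHILD
)


def timed(func):
    """Run func on the child command and return the elapsed seconds."""
    start = time.monotonic()
    func([sys.executable, "-c", CHILD])
    return time.monotonic() - start


if __name__ == "__main__":
    # call(): stdout is inherited, so we return as soon as the child exits.
    print("call:         %.2fs" % timed(subprocess.call))
    # check_output(): the pipe stays open until the grandchild exits too.
    print("check_output: %.2fs" % timed(subprocess.check_output))
```

With `call` nothing in the parent reads from a pipe, so the grandchild's lifetime does not matter; with `check_output` the parent cannot return before the last writer goes away.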

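The only way I can think of to determine the cause in this case is for you to either add debugging code to the subprocess file, or invoke the Python debugger and get a backtrace when you hit the condition you are experiencing.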
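If attaching a debugger is awkward, one alternative is the standard `faulthandler` module, which can dump the traceback of a call that takes longer than expected. This assumes Python 3, which the asker's interpreter may predate. The sketch uses a sleeping Python process as a hypothetical stand-in for the slow command and writes the dump to a temporary file; the helper name `traceback_if_slow` is made up for illustration.

```python
import faulthandler
import subprocess
import sys
import tempfile


def traceback_if_slow(cmd, timeout=0.3):
    """Run cmd via check_output; return any traceback dumped after `timeout` s."""
    with tempfile.TemporaryFile(mode="w+") as dump:
        # Ask faulthandler to write the thread tracebacks to `dump` if we are
        # still running after `timeout` seconds (the process keeps running).
        faulthandler.dump_traceback_later(timeout, file=dump)
        try:
            subprocess.check_output(cmd)
        finally:
            # Disarm the watchdog whether or not it fired.
            faulthandler.cancel_dump_traceback_later()
        dump.seek(0)
        return dump.read()


if __name__ == "__main__":
    slow = [sys.executable, "-c", "import time; time.sleep(1)"]
    print(traceback_if_slow(slow))
```

The dump shows which line inside `subprocess` the parent is sitting on, which is enough to tell a blocked pipe read apart from a blocked wait.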

Answered 2025-04-18 by Python大师

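According to the docs, subprocess.call and subprocess.check_output are both use cases of subprocess.Popen. One minor difference is that check_output raises a Python error if the subprocess returns a non-zero exit status. The bigger difference lies in this passage about check_output, specifically the remark that stdout is used internally:

The full function signature is largely the same as that of the Popen constructor, except that stdout is not permitted as it is used internally. All other supplied arguments are passed directly through to the Popen constructor.

So how is stdout "used internally"? Let's compare call and check_output: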

call

    def call(*popenargs, **kwargs):
        return Popen(*popenargs, **kwargs).wait()

check_output

    def check_output(*popenargs, **kwargs):
        if 'stdout' in kwargs:
            raise ValueError('stdout argument not allowed, it will be overridden.')
        process = Popen(stdout=PIPE, *popenargs, **kwargs)
        output, unused_err = process.communicate()
        retcode = process.poll()
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
            raise CalledProcessError(retcode, cmd, output=output)
        return output
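In practice the difference looks like this. A minimal sketch, assuming Python 3 and using the current interpreter as a hypothetical child command: `call` hands back the exit status and leaves the output on the parent's stdout, while `check_output` returns the captured bytes and raises `CalledProcessError` on a non-zero status.

```python
import subprocess
import sys

# Hypothetical child commands used only for illustration.
OK_CMD = [sys.executable, "-c", "print('hello')"]
FAIL_CMD = [sys.executable, "-c", "import sys; print('oops'); sys.exit(3)"]


def compare():
    """Return what call and check_output each give back to the caller."""
    # call: the child's output goes straight to our own stdout.
    status = subprocess.call(OK_CMD)
    # check_output: the child's stdout is captured through a pipe.
    output = subprocess.check_output(OK_CMD)
    try:
        subprocess.check_output(FAIL_CMD)
        failure = None
    except subprocess.CalledProcessError as exc:
        # The captured output is still available on the exception.
        failure = (exc.returncode, exc.output)
    return status, output, failure


if __name__ == "__main__":
    print(compare())
```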

communicate

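Now we have to look at Popen.communicate too. Doing so, we notice that for a single pipe, communicate does several things that simply take more time than just returning Popen().wait(), as call does.

For one thing, communicate processes stdout=PIPE whether you set shell=True or not. Clearly, call does not. It just lets your shell write out whatever it likes... which makes it a security risk, as the Python documentation describes.

Secondly, in the case of check_output(cmd, shell=True) (just one pipe)... whatever your subprocess sends to stdout has to be read to the end inside communicate (by a reader thread in the threaded _communicate code path). Popen must wait for that read to finish (joining the thread) before it additionally waits for the subprocess itself to terminate!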

Plus, more trivially, it accumulates stdout as a list, which must then be joined into a string.
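The read-then-wait sequence can be written out by hand. This is a rough sketch of the idea, assuming Python 3, and not the library's actual implementation: it collects chunks from the pipe in a list, joins them once at end-of-file, and only then waits for the child. The helper name `capture` and its `chunk_size` parameter are made up for illustration.

```python
import subprocess
import sys


def capture(cmd, chunk_size=4096):
    """Read a child's stdout to EOF in chunks, then wait for the child."""
    child = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    chunks = []
    while True:
        # Blocks until data arrives or every writer has closed the pipe.
        chunk = child.stdout.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    child.stdout.close()
    # Only after EOF on the pipe do we reap the child.
    returncode = child.wait()
    return b"".join(chunks), returncode


if __name__ == "__main__":
    out, rc = capture([sys.executable, "-c", "print('x' * 10000)"])
    print(len(out), rc)
```

The ordering is the point: the parent does not reach `wait()` until the read loop has seen end-of-file, so anything that keeps the pipe open delays the return.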

In short, even with minimal arguments, check_output spends a lot more time inside the Python process than call does.