Post-mortem debugging of a multi-threaded script

8 votes
3 answers
1594 views
Asked 2025-04-18 03:16

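I am debugging a multi-threaded script. When an exception is raised, I want to:

  1. Report the exception to a monitoring system (in the example below it is just printed)
  2. Stop the whole script (including all other threads)
  3. Invoke a debugger prompt after the exception has been raised

I prepared a fairly elaborate example to show how I tried to solve this: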

#!/usr/bin/env python

import threading
import inspect
import traceback
import sys
import os
import time


def POST_PORTEM_DEBUGGER(type, value, tb):
    traceback.print_exception(type, value, tb)
    print
    if hasattr(sys, 'ps1') or not sys.stderr.isatty():
        import rpdb
        rpdb.pdb.pm()
    else:
        import pdb
        pdb.pm()

sys.excepthook = POST_PORTEM_DEBUGGER



class MyThread(threading.Thread):

    def __init__(self):

        threading.Thread.__init__(self)
        self.exception = None
        self.info = None
        self.the_calling_script_name = os.path.abspath(inspect.currentframe().f_back.f_code.co_filename)

    def main(self):
        "Virtual method to be implemented by inherited worker"
        return self

    def run(self):
        try:
            self.main()
        except Exception as exception:
            self.exception = exception
            self.info = traceback.extract_tb(sys.exc_info()[2])[-1]
            # because of bug http://bugs.python.org/issue1230540
            # I cannot use just "raise" under threading.Thread
            sys.excepthook(*sys.exc_info())

    def __del__(self):
        print 'MyThread via {} catch "{}: {}" in {}() from {}:{}: {}'.format(self.the_calling_script_name, type(self.exception).__name__, str(self.exception), self.info[2], os.path.basename(self.info[0]), self.info[1], self.info[3])




class Worker(MyThread):

    def __init__(self):
        super(Worker, self).__init__()

    def main(self):
        """ worker job """
        counter = 0
        while True:
            counter += 1
            print self
            time.sleep(1.0)
            if counter == 3:
                pass # print 1/0


def main():

    Worker().start()

    counter = 1
    while True:
        counter += 1
        time.sleep(1.0)
        if counter == 3:
            pass # print 1/0

if __name__ == '__main__':
    main()

Using

sys.excepthook = POST_PORTEM_DEBUGGER

works well when no threads are involved. I found that in a multi-threaded script I can debug with rpdb by calling:

import rpdb; rpdb.set_trace()

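This works well at an explicitly set breakpoint, but I want to debug the multi-threaded script post mortem (that is, after an uncaught exception has been raised). When I try to use rpdb in the POST_PORTEM_DEBUGGER function of the multi-threaded application, I get the following: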

Exception in thread Thread-1:
Traceback (most recent call last):
    File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
        self.run()
    File "./demo.py", line 49, in run
        sys.excepthook(*sys.exc_info())
    File "./demo.py", line 22, in POST_PORTEM_DEBUGGER
        pdb.pm()
    File "/usr/lib/python2.7/pdb.py", line 1270, in pm
        post_mortem(sys.last_traceback)
AttributeError: 'module' object has no attribute 'last_traceback'

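It looks like

sys.excepthook(*sys.exc_info())

does not set up everything that the raise statement does. I want the same behavior as when an exception is raised in main(), even when it happens in a started thread.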

3 Answers

-1

This might help:

import sys
from IPython.core import ultratb

sys.excepthook = ultratb.FormattedTB(mode='Verbose', color_scheme='Linux',
    call_pdb=True, ostream=sys.__stdout__)
0

Building on @shx2's answer, I now use the following pattern in multi-threaded code.

import sys, pdb

try:
    ...  # logic that may fail
except Exception as exc:
    pdb.post_mortem(exc.__traceback__)

Here is a more verbose alternative:

import sys, pdb

try:
    ...  # logic that may fail
except Exception as exc:
    if hasattr(sys, "last_traceback"):
        pdb.pm()
    else:
        pdb.post_mortem(exc.__traceback__)
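
On Python 3.8 and later, the same idea can be applied once for every thread through threading.excepthook, rather than wrapping each worker in try/except. The following is only a sketch under that version assumption. The names install_thread_post_mortem and debugger are hypothetical and not part of the original answer; the debugger parameter exists only so the hook can be exercised without opening an interactive prompt.

```python
import pdb
import threading
import traceback


def install_thread_post_mortem(debugger=pdb.post_mortem):
    """Send uncaught exceptions raised in threads to a post-mortem debugger.

    Assumes Python 3.8+, where threading.excepthook exists.
    `debugger` is a hypothetical injection point; it defaults to
    pdb.post_mortem. Returns the previously installed hook.
    """
    previous = threading.excepthook

    def hook(args):
        # args carries exc_type, exc_value, exc_traceback and thread.
        traceback.print_exception(
            args.exc_type, args.exc_value, args.exc_traceback
        )
        # Hand over the traceback explicitly; sys.last_traceback is not
        # set for exceptions that end a thread.
        debugger(args.exc_traceback)

    threading.excepthook = hook
    return previous
```

Calling install_thread_post_mortem() before starting the workers should be enough. Note that pdb reads from the process's stdin, so when several threads fail at once, or when no terminal is attached, something like the rpdb approach from the question may still be needed.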
2

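(I haven't tested my answer, but it seems to me that...)

The call to pdb.pm (pm = "post mortem") fails simply because no "mortem" has happened before it. In other words, the program is still in the middle of execution.

Looking at the pdb source code, you will find the implementation of pdb.pm: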

def pm():
    post_mortem(sys.last_traceback)

This leads me to guess that what you actually want is to call pdb.post_mortem() with no arguments. The default behavior looks like it does exactly what you need.

Some more source code (note the line t = sys.exc_info()[2]):

def post_mortem(t=None):
    # handling the default
    if t is None:
        # sys.exc_info() returns (type, value, traceback) if an exception is
        # being handled, otherwise it returns None
        t = sys.exc_info()[2]
        if t is None:
            raise ValueError("A valid traceback must be passed if no "
                                               "exception is being handled")

    p = Pdb()
    p.reset()
    p.interaction(None, t)
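
Applied to the hook from the question, this suggests passing the traceback that the hook already receives straight to pdb.post_mortem instead of calling pdb.pm(). The following is a minimal, untested-against-rpdb sketch written for Python 3. The names make_post_mortem_hook and debugger are hypothetical and not part of the original post; the debugger parameter is there only so the hook can be tried without opening an interactive prompt.

```python
import pdb
import sys
import traceback


def make_post_mortem_hook(debugger=pdb.post_mortem):
    """Build an excepthook that debugs the traceback it is handed.

    `debugger` is a hypothetical injection point; it defaults to
    pdb.post_mortem.
    """
    def hook(exc_type, exc_value, tb):
        traceback.print_exception(exc_type, exc_value, tb)
        # Pass the traceback explicitly. pdb.pm() would instead read
        # sys.last_traceback, which is not set when the hook is called
        # by hand from inside an except block.
        debugger(tb)

    return hook
```

The hook can then be installed with sys.excepthook = make_post_mortem_hook() and invoked from a thread's run() method as hook(*sys.exc_info()), the same way the question calls sys.excepthook.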
