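Post-mortem debugging of a multi-threaded script
I am debugging a multi-threaded script. When an exception is raised, I want to:
- report the exception to a monitoring system (in the example below it is simply printed)
- stop the whole script (including all other threads)
- drop into a debugger prompt after the exception
I have prepared a fairly involved example that shows how I tried to solve this: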
#!/usr/bin/env python
import threading
import inspect
import traceback
import sys
import os
import time

def POST_PORTEM_DEBUGGER(type, value, tb):
    traceback.print_exception(type, value, tb)
    print
    if hasattr(sys, 'ps1') or not sys.stderr.isatty():
        import rpdb
        rpdb.pdb.pm()
    else:
        import pdb
        pdb.pm()

sys.excepthook = POST_PORTEM_DEBUGGER

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.exception = None
        self.info = None
        self.the_calling_script_name = os.path.abspath(inspect.currentframe().f_back.f_code.co_filename)

    def main(self):
        "Virtual method to be implemented by inherited worker"
        return self

    def run(self):
        try:
            self.main()
        except Exception as exception:
            self.exception = exception
            self.info = traceback.extract_tb(sys.exc_info()[2])[-1]
            # because of bug http://bugs.python.org/issue1230540
            # I cannot use just "raise" under threading.Thread
            sys.excepthook(*sys.exc_info())

    def __del__(self):
        print 'MyThread via {} catch "{}: {}" in {}() from {}:{}: {}'.format(self.the_calling_script_name, type(self.exception).__name__, str(self.exception), self.info[2], os.path.basename(self.info[0]), self.info[1], self.info[3])

class Worker(MyThread):
    def __init__(self):
        super(Worker, self).__init__()

    def main(self):
        """ worker job """
        counter = 0
        while True:
            counter += 1
            print self
            time.sleep(1.0)
            if counter == 3:
                pass # print 1/0

def main():
    Worker().start()
    counter = 1
    while True:
        counter += 1
        time.sleep(1.0)
        if counter == 3:
            pass # print 1/0

if __name__ == '__main__':
    main()
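Using
sys.excepthook = POST_PORTEM_DEBUGGER
works well when there are no threads. I also found that in a multi-threaded script I can debug with rpdb by calling:
import rpdb; rpdb.set_trace()
That works well at an explicit breakpoint, but I want to debug the multi-threaded script post mortem, that is, after an uncaught exception has occurred. When I try to use rpdb from the POST_PORTEM_DEBUGGER function in the multi-threaded application, I get the following: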
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "./demo.py", line 49, in run
    sys.excepthook(*sys.exc_info())
  File "./demo.py", line 22, in POST_PORTEM_DEBUGGER
    pdb.pm()
  File "/usr/lib/python2.7/pdb.py", line 1270, in pm
    post_mortem(sys.last_traceback)
AttributeError: 'module' object has no attribute 'last_traceback'
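It looks like
sys.excepthook(*sys.exc_info())
does not set up everything that a plain raise
statement would. I want an exception raised in a started thread to behave the same way as an exception raised in main().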
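For readers on Python 3.8 or later, the standard library provides `threading.excepthook`, which is called for exceptions that escape `Thread.run()`. The sketch below assumes such an interpreter (the question targets Python 2.7, where this hook does not exist). The names `make_recording_hook`, `failing_worker` and `run_demo` are hypothetical, and a real hook would probably hand `args.exc_traceback` to a debugger rather than store a summary.

```python
import threading
import traceback


def make_recording_hook(failures):
    """Build a hook that appends a short summary of each thread failure."""
    def hook(args):
        # args carries exc_type, exc_value, exc_traceback and thread
        frames = traceback.extract_tb(args.exc_traceback)
        failures.append((args.thread.name, args.exc_type.__name__, frames[-1].name))
    return hook


def failing_worker():
    return 1 / 0  # raises ZeroDivisionError inside the thread


def run_demo():
    failures = []
    previous_hook = threading.excepthook
    threading.excepthook = make_recording_hook(failures)
    try:
        thread = threading.Thread(target=failing_worker, name="worker-1")
        thread.start()
        thread.join()
    finally:
        threading.excepthook = previous_hook  # restore the previous hook
    return failures
```

Because the traceback arrives as part of `args`, such a hook never needs `sys.last_traceback`.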
3 Answers
-1
This might help:
import sys
from IPython.core import ultratb
sys.excepthook = ultratb.FormattedTB(mode='Verbose', color_scheme='Linux',
                                     call_pdb=True, ostream=sys.__stdout__)
0
Building on @shx2's answer, I now use the following pattern in multi-threaded code.
import sys, pdb
try:
    ... # logic that may fail
except Exception as exc:
    # hand the traceback of the caught exception straight to pdb
    pdb.post_mortem(exc.__traceback__)
Here is a more verbose alternative:
import sys, pdb
try:
    ... # logic that may fail
except Exception as exc:
    if hasattr(sys, "last_traceback"):
        # only defined after the interpreter has reported an unhandled exception
        pdb.pm()
    else:
        pdb.post_mortem(exc.__traceback__)
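The same pattern can be wrapped in a thread class so that the traceback is passed to the debugger from inside the failing thread. This is only a sketch: `DebuggableThread`, its `on_error` parameter and `failing_job` are hypothetical names rather than part of `threading`, and the default of `pdb.post_mortem` assumes the process has a usable terminal.

```python
import pdb
import threading


class DebuggableThread(threading.Thread):
    """Thread that passes the traceback of a failure to a callback."""

    def __init__(self, work, on_error=pdb.post_mortem):
        super().__init__()
        self._work = work          # callable executed in the thread
        self._on_error = on_error  # receives the traceback object
        self.exception = None

    def run(self):
        try:
            self._work()
        except Exception as exc:
            self.exception = exc
            # Pass the traceback explicitly instead of relying on sys.last_traceback
            self._on_error(exc.__traceback__)


def failing_job():
    raise ValueError("example failure")
```

Swapping `on_error` for a function that records or logs the traceback keeps the class usable where no interactive prompt is available.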
2
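(I have not tested my answer, but I think...)
The call to pdb.pm (pm = "post mortem") fails simply because no "death" has occurred before it. In other words, the program is still running.
If you look at the pdb source code, you will find this implementation of pdb.pm: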
def pm():
    post_mortem(sys.last_traceback)
That leads me to guess that what you actually want is to call pdb.post_mortem() with no arguments. The default behavior appears to do exactly what you need.
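Applied to the hook from the question, that suggestion might look roughly like the sketch below. The hook already receives the traceback as its third argument, so it can pass it on instead of calling `pdb.pm()`. The `debugger` parameter and the helper `call_hook_from_handler` are hypothetical additions that let the hook be exercised without an interactive prompt, and none of this has been tried with rpdb.

```python
import pdb
import sys
import traceback


def post_mortem_hook(exc_type, exc_value, tb, debugger=pdb.post_mortem):
    """Print the exception, then start the debugger on the given traceback."""
    traceback.print_exception(exc_type, exc_value, tb)
    debugger(tb)


def call_hook_from_handler(hook):
    """Mimic the question's run(): call the hook manually from an except block."""
    try:
        {}["missing"]  # raises KeyError
    except KeyError:
        hook(*sys.exc_info())
```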
Here is some more of the source (note the line t = sys.exc_info()[2]):
def post_mortem(t=None):
    # handling the default
    if t is None:
        # sys.exc_info() returns (type, value, traceback) if an exception is
        # being handled, otherwise it returns None
        t = sys.exc_info()[2]
        if t is None:
            raise ValueError("A valid traceback must be passed if no "
                             "exception is being handled")
    p = Pdb()
    p.reset()
    p.interaction(None, t)
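To see why the no-argument call can work inside an `except` block, it helps to compare what `sys.exc_info()` reports while an exception is being handled and when none is. A small sketch with hypothetical helper names, written for Python 3:

```python
import sys


def traceback_while_handling():
    """Return the traceback seen by sys.exc_info() inside an except block."""
    try:
        raise RuntimeError("example failure")
    except RuntimeError:
        return sys.exc_info()[2]


def traceback_when_idle():
    """Return what sys.exc_info() reports when no exception is being handled."""
    return sys.exc_info()[2]
```

The first helper returns a traceback object, which is what `post_mortem()` picks up by default. The second returns `None`, which is the case that triggers the `ValueError` in the source above.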