How to detect deadlocks in a Django application (and fix them)

数据工程师

Asked 2025-04-17 09:56

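I maintain a Django project that regularly becomes unresponsive. So far my workaround has been to monitor the application continuously and restart Apache when necessary.

What do I mean by unresponsive? Apache stops answering any request at all.

Environment:

OS: Debian Squeeze 64-bit
Web server: Apache 2.2.16 with mod_wsgi (mod_python was used for about a year before that)
Django: 1.3.1 (and every major release since 1.0)
Python: 2.6.6 + virtualenv (with distribute, no site-packages; several other configurations were run previously)
Database adapter: psycopg2 2.3.2
Database: PostgreSQL 9.0 (8.3 previously)
Connection pool: pgbouncer (the problem persists without the bouncer)
Reverse proxy: nginx 1.0.11

What can I do to get closer to the root cause? (I cannot share the source code, but I can provide snippets.) I have been chasing this problem for so long that it is nearly impossible to list everything I have tried. I have tried to remove every "magic" setting I could think of, and several parts of the application have been rewritten since the problem first appeared.

I apologize for the lack of information. I am happy to provide (almost) anything that is requested, and I will do my best to make this post as useful as possible for others facing a similar problem.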

django apache nginx postgresql web application performance troubleshooting pgbouncer deadlock detection

2 Answers

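You may be hitting the following Django bug [1] (not yet fixed in 1.4).

Workaround: apply the fix to your Django source by hand, or use a thread-safe wrapper around the wsgi module's handler like the one below (we use this on our production systems).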

from __future__ import with_statement
from django.core.handlers.wsgi import WSGIHandler as DjangoWSGIHandler

from threading import Lock

__copyright__ = "Jibe"

class WSGIHandler(DjangoWSGIHandler):
    """
    This provides a thread-safe drop-in replacement for Django's WSGIHandler.

    Initialisation of Django via a multithreaded WSGI handler is not safe.
    It is vulnerable to an A-B / B-A deadlock.

    When two threads bootstrap Django via different URLs, there is a chance
    of hitting the following deadlock:

      thread 1                                   thread 2
        view A                                     view B
        import file foo                            import file bar
          (import lock foo)                          (import lock bar)
        bootstrap django
          (lock AppCache.write_lock)
        import file bar
          (import lock bar)  <-- blocks
                                                   bootstrap django
                                                     (lock AppCache.write_lock)  <-- deadlock

    Workaround for an AB / BA deadlock: wrap it in a lock C.

        lock C                      lock C
            lock A                      lock B
            lock B                      lock A
            release B                   release A
            release A                   release B
        release C                   release C

    That is exactly what this class does, but only for the first few calls.
    After that we stop taking lock C, as the AppCache.write_lock is only held
    while Django is being booted.

    If we did not drop lock C after the first few calls, the whole app would
    effectively be single-threaded again.

    Usage:
        in your wsgi file replace the following lines
                import django.core.handlers.wsgi
                application = django.core.handlers.wsgi.WSGIHandler()
        by
                import threadsafe_wsgi
                application = threadsafe_wsgi.WSGIHandler()

    FAQ:
        Q: Why would you want threading in the first place?
        A: To reduce memory. Big apps can consume hundreds of megabytes each,
           so adding processes is much more expensive than adding threads.
           That memory is better spent on caching, when threads are almost free.

        Q: This deadlock looks far-fetched. Is it real?
        A: Yes, we had this problem on production machines.
    """
    __initLock = Lock()  # lock C 
    __initialized = 0 

    def __call__(self, environ, start_response): 
        # For the first MIN_INIT_CALLS calls we squeeze everybody through
        # lock C, which effectively serializes all threads while Django boots.
        MIN_INIT_CALLS = 4
        if self.__initialized < MIN_INIT_CALLS:
            with self.__initLock:
                ret = DjangoWSGIHandler.__call__(self, environ, start_response)
                self.__initialized += 1
                return ret
        else:
            # We are safely bootstrapped, so skip lock C.
            # From here on we are running multi-threaded again.
            return DjangoWSGIHandler.__call__(self, environ, start_response)

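Then use the following in your wsgi.py file (the import path assumes the class above is saved as threadsafe_wsgi/handlers.py):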

from threadsafe_wsgi.handlers import WSGIHandler
django_handler = WSGIHandler()
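
To see the lock C idea from the docstring in isolation, here is a minimal sketch that has nothing to do with Django itself. The names lock_a, lock_b, lock_c, take_a_then_b, take_b_then_a and run_once are made up for this illustration. Two threads take A and B in opposite order, which could deadlock on its own, but because both go through lock C first, only one of them is ever inside the A/B section at a time.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
lock_c = threading.Lock()  # the outer "lock C" (hypothetical name)


def take_a_then_b(done):
    # Thread 1 ordering: A, then B, all under lock C.
    with lock_c:
        with lock_a:
            with lock_b:
                done.append("A-B")


def take_b_then_a(done):
    # Thread 2 ordering: B, then A, all under lock C.
    with lock_c:
        with lock_b:
            with lock_a:
                done.append("B-A")


def run_once():
    """Run both orderings concurrently and return what finished."""
    done = []
    threads = [
        threading.Thread(target=take_a_then_b, args=(done,)),
        threading.Thread(target=take_b_then_a, args=(done,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        # The timeout only guards the demo; with lock C it is never hit.
        t.join(5)
    return sorted(done)
```

Holding lock C forever would serialize every request, which is why the handler above only takes it for the first few calls.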

[1] https://code.djangoproject.com/ticket/18251

Answered 2025-04-17 by Python大师

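Ultimately, what you need are the new features added in mod_wsgi 4.0. They give you better control over automatic restarts when requests block. When requests do block, mod_wsgi will attempt to dump Python stack traces showing what each Python request thread was doing at the time, so you can see why they are stuck.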
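
To give a rough idea of what such a per-thread dump contains, here is a minimal standard-library sketch. It is not mod_wsgi's implementation, and the function name dump_thread_stacks is made up for this illustration. It relies on sys._current_frames(), which returns the topmost frame of every running Python thread.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every Python thread."""
    # Map thread ids to names so the report is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for thread_id, frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        lines.append("Thread %s (%s):" % (thread_id, name))
        # format_stack() returns the stack as already formatted strings.
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sys.stderr.write(dump_thread_stacks())
```

In a hung process you would have to trigger this from somewhere that still runs, for example a background watchdog thread. If several request threads all show up waiting on an import or on the same lock, that points at the kind of deadlock described in the other answer.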

I suggest raising this on the mod_wsgi mailing list, where I can explain the new features in more detail if needed. I have also posted about them before here:

http://groups.google.com/group/modwsgi/msg/2a968d820e18e97d

At the moment the mod_wsgi 4.0 code is only available from the source repository. The current trunk is considered stable.

Answered 2025-04-17 by Python大师
