我的Django查询在termin上的givig me数据非常慢

2024-03-29 11:30:37 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个用户表,其中有3种类型的用户学生,教师和俱乐部,我有一个大学表。 我想要的是在特定的大学里有多少用户。 我正在得到我想要的输出,但是输出非常慢。我有9万个用户,它生成的输出需要几分钟才能产生结果。你知道吗

我的用户型号:-你知道吗

from __future__ import unicode_literals
from django.db import models
from django.contrib.auth.models import User
from cms.models.masterUserTypes import MasterUserTypes
from cms.models.universities import Universities
from cms.models.departments import MasterDepartments



# WE ARE AT MODELS/APPUSERS

requestChoice = (
    ('male', 'male'),
    ('female', 'female'),
    )


class Users(models.Model):
    id = models.IntegerField(db_column="id", max_length=11, help_text="")
    userTypeId = models.ForeignKey(MasterUserTypes, db_column="userTypeId")
    universityId = models.ForeignKey(Universities, db_column="universityId")  
    departmentId = models.ForeignKey(MasterDepartments , db_column="departmentId",help_text="")  
    name = models.CharField(db_column="name",max_length=255,help_text="")
    username = models.CharField(db_column="username",unique=True, max_length=255,help_text="")
    email = models.CharField(db_column="email",unique=True, max_length=255,help_text="")
    password = models.CharField(db_column="password",max_length=255,help_text="")
    bio = models.TextField(db_column="bio",max_length=500,help_text="")
    gender = models.CharField(db_column="gender",max_length=6, choices=requestChoice,help_text="")
    mobileNo = models.CharField(db_column='mobileNo', max_length=16,help_text="")  
    dob = models.DateField(db_column="dob",help_text="")
    major = models.CharField(db_column="major",max_length=255,help_text="")
    graduationYear = models.IntegerField(db_column='graduationYear',max_length=11,help_text="")  
    canAddNews = models.BooleanField(db_column='canAddNews',default=False,help_text="")  
    receivePrivateMsgNotification = models.BooleanField(db_column='receivePrivateMsgNotification',default=True ,help_text="")  
    receivePrivateMsg = models.BooleanField(db_column='receivePrivateMsg',default=True ,help_text="")
    receiveCommentNotification = models.BooleanField(db_column='receiveCommentNotification',default=True ,help_text="")  
    receiveLikeNotification = models.BooleanField(db_column='receiveLikeNotification',default=True ,help_text="")  
    receiveFavoriteFollowNotification = models.BooleanField(db_column='receiveFavoriteFollowNotification',default=True ,help_text="")  
    receiveNewPostNotification = models.BooleanField(db_column='receiveNewPostNotification',default=True ,help_text="")  
    allowInPopularList = models.BooleanField(db_column='allowInPopularList',default=True ,help_text="")  
    xmppResponse = models.TextField(db_column='xmppResponse',help_text="")  
    xmppDatetime = models.DateTimeField(db_column='xmppDatetime', help_text="")  
    status = models.BooleanField(db_column="status", default=False, help_text="")
    deactivatedByAdmin = models.BooleanField(db_column="deactivatedByAdmin", default=False, help_text="")
    createdAt = models.DateTimeField(db_column='createdAt', auto_now=True, help_text="")  
    modifiedAt = models.DateTimeField(db_column='modifiedAt', auto_now=True, help_text="")  
    updatedBy = models.ForeignKey(User,db_column="updatedBy",help_text="Logged in user updated by ......")
    lastPasswordReset = models.DateTimeField(db_column='lastPasswordReset',help_text="")
    authorities = models.CharField(db_column="departmentId",max_length=255,help_text="")

    class Meta:
        managed = False
        db_table = 'users'

我正在使用的查询产生了所需的输出,但是太慢了:-

universities = Universities.objects.using('cms').all()
    for item in universities:
        studentcount = Users.objects.using('cms').filter(universityId=item.id,userTypeId=2).count()
        facultyCount = Users.objects.using('cms').filter(universityId=item.id,userTypeId=1).count()
        clubCount = Users.objects.using('cms').filter(universityId=item.id,userTypeId=3).count()
        totalcount = Users.objects.using('cms').filter(universityId=item.id).count()
        print studentcount,facultyCount,clubCount,totalcount
        print item.name

Tags: textfromimportidtruedefaultdbcms
2条回答

首先,一个明显的优化。在循环中,基本上要对同一个查询执行四次:三次过滤不同的userTypeId,一次不过滤。您可以在一个COUNT(*) ... GROUP BY userTypeId查询中完成此操作。你知道吗

...
# Here, we're building a dict {userTypeId: count}
# by counting PKs over each userTypeId
qs = Users.objects.using('cms').filter(universityId=item.id)
counts = {
    x["userTypeId"]: x["cnt"]
    for x in qs.values('userTypeId').annotate(cnt=Count('pk'))
}

student_count = counts.get(2, 0)
faculty_count = counts.get(1, 0)
club_count = count.get(3, 0)
total_count = sum(count.values())  # Assuming there may be other userTypeIds
...

但是,您仍然在进行1+n查询,其中n是数据库中的大学数。如果这个数字很低,这是可以的,但是如果它很高,你需要进一步的聚合,加入UniversitiesUsers。我的初稿是这样的:

# Assuming University.name is unique, otherwise you'll need to use IDs
# to distinguish between different projects, instead of names.
qs = Users.objects.using('cms').values('userTypeId', 'university__name')\
    .annotate(cnt=Count('pk').order_by('university__name')
for name, group in itertools.groupby(qs, lambda x: x["university__name"]):
    print("University: %s" % name)
    cnts = {g["userTypeId"]: g["cnt"] for g in group}
    faculty, student, club = cnts.get(1, 0), cnts.get(2, 0), cnts.get(3, 0)
    # NOTE: I'm assuming there are only few (if any) userTypeId values
    #       other than {1,2,3}.
    total = sum(cnts.values())
    print("  Student: %d, faculty: %d, club: %d, total: %d" % (
          student, faculty, club, total))

我可能打错了,但希望是对的。就SQL而言,它应该发出如下查询

SELECT uni.name, usr.userTypeId, COUNT(usr.id)
FROM some_app_universities AS uni
LEFT JOUN some_app_users AS usr ON us.universityId = uni.id
GROUP BY uni.name, usr.userTypeId
ORDER BY uni.name

考虑阅读aggregations and annotations上的文档。一定要检查Django ORM发出的原始SQL(例如,使用Django Debug Toolbar),并分析它在数据库中的工作情况。例如,如果您使用的是PostgreSQL,请使用EXPLAIN SELECT。根据您的数据集,您可能会受益于其中的一些索引(例如userTypeId列)。你知道吗

哦,顺便说一句。。。这是离题的,但在Python中,让变量和属性使用lowercase_with_underscores是一种定制。在Django中,模型类名通常是单数的,例如UserUniversity。你知道吗

应该使用annotate获取每个大学的计数,并使用条件表达式获取基于条件(docs)的计数

Universities.objects.using('cms').annotate(
    studentcount=Sum(Case(When(users_set__userTypeId=2, then=1), output_field=IntegerField())),
    facultyCount =Sum(Case(When(users_set__userTypeId=1, then=1), output_field=IntegerField())),
    clubCount=Sum(Case(When(users_set__userTypeId=3, then=1), output_field=IntegerField())),
    totalcount=Count('users_set'),
)

相关问题 更多 >