Python: Is this an OK way of overriding __eq__ and __hash__?

24 votes
1 answer
32941 views
Asked 2025-04-16 00:11

I'm new to Python, and I want to make sure I've overridden __eq__ and __hash__ correctly, so that I don't run into painful bugs later:

(I'm using Google App Engine.)

class Course(db.Model):
    dept_code = db.StringProperty()
    number = db.IntegerProperty()
    title = db.StringProperty()
    raw_pre_reqs = db.StringProperty(multiline=True)
    original_description = db.StringProperty()

    def getPreReqs(self):
        return pickle.loads(str(self.raw_pre_reqs))

    def __repr__(self):
        title_msg = self.title if self.title else "Untitled"
        return "%s %s: %s" % (self.dept_code, self.number, title_msg)

    def __attrs(self):
        return (self.dept_code, self.number, self.title, self.raw_pre_reqs, self.original_description)

    def __eq__(self, other):
        return isinstance(other, Course) and self.__attrs() == other.__attrs()

    def __hash__(self):
        return hash(self.__attrs())

Here is a slightly more complicated type:

class DependencyArcTail(db.Model):
    ''' A list of courses that is a pre-req for something else '''
    courses = db.ListProperty(db.Key)

    ''' a list of heads that reference this one '''
    forwardLinks = db.ListProperty(db.Key)

    def __repr__(self):
        return "DepArcTail %d: courses='%s' forwardLinks='%s'" % (id(self), getReprOfKeys(self.courses), getIdOfKeys(self.forwardLinks))

    def __eq__(self, other):
        if not isinstance(other, DependencyArcTail):
            return False

        for this_course in self.courses:
            if not (this_course in other.courses):
                return False

        for other_course in other.courses:
            if not (other_course in self.courses):
                return False

        return True

    def __hash__(self):
        return hash((tuple(self.courses), tuple(self.forwardLinks)))

Does everything look good?

Updated to reflect @Alex's comments:

class DependencyArcTail(db.Model):
    ''' A list of courses that is a pre-req for something else '''
    courses = db.ListProperty(db.Key)

    ''' a list of heads that reference this one '''
    forwardLinks = db.ListProperty(db.Key)

    def __repr__(self):
        return "DepArcTail %d: courses='%s' forwardLinks='%s'" % (id(self), getReprOfKeys(self.courses), getIdOfKeys(self.forwardLinks))

    def __eq__(self, other):
        return isinstance(other, DependencyArcTail) and set(self.courses) == set(other.courses) and set(self.forwardLinks) == set(other.forwardLinks)

    def __hash__(self):
        return hash((tuple(self.courses), tuple(self.forwardLinks)))

1 Answer

18

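The first one is fine. The second one is problematic for two reasons:

  1. There may be duplicates in .courses.
  2. Two objects with identical .courses but different .forwardLinks would compare equal yet have different hashes. That breaks the rule that objects which compare equal must hash equal.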
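Both problems can be reproduced without App Engine. This is a minimal sketch, assuming a plain class in place of db.Model and strings in place of db.Key values; the class name ArcTailOriginal and the sample data are hypothetical, while __eq__ and __hash__ follow the question's first DependencyArcTail:

```python
class ArcTailOriginal(object):
    """Hypothetical stand-in for the question's first DependencyArcTail."""

    def __init__(self, courses, forward_links):
        self.courses = list(courses)
        self.forwardLinks = list(forward_links)

    def __eq__(self, other):
        # Same logic as the question: only .courses is compared.
        if not isinstance(other, ArcTailOriginal):
            return False
        for this_course in self.courses:
            if this_course not in other.courses:
                return False
        for other_course in other.courses:
            if other_course not in self.courses:
                return False
        return True

    def __hash__(self):
        # Same logic as the question: both lists feed the hash.
        return hash((tuple(self.courses), tuple(self.forwardLinks)))


a = ArcTailOriginal(["CS101", "CS102"], ["head1"])
b = ArcTailOriginal(["CS101", "CS102"], ["head2"])           # differs only in forwardLinks
c = ArcTailOriginal(["CS101", "CS101", "CS102"], ["head1"])  # duplicate course

print(a == b)              # True: forwardLinks is ignored by __eq__
print(hash(a) == hash(b))  # False in practice: forwardLinks changes the hash
print(a == c)              # True: the duplicate does not affect __eq__ ...
print(hash(a) == hash(c))  # ... but it changes the tuple, so in practice False
```

Because a == b while their hashes differ, a set or dict will usually treat them as two distinct keys.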

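I would fix the second one by making equality depend on both the courses and the forward links, with both converted to sets (so duplicates no longer matter), and by hashing the same way. That is: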

def __eq__(self, other):
    if not isinstance(other, DependencyArcTail):
        return False

    return (set(self.courses) == set(other.courses) and
            set(self.forwardLinks) == set(other.forwardLinks))

def __hash__(self):
    return hash((frozenset(self.courses), frozenset(self.forwardLinks)))

This of course assumes that the forward links are crucial to an object's "real value"; otherwise they should be omitted from both __eq__ and __hash__.
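To check that the set-based version keeps equality and hashing consistent, here is a small sketch under the same assumptions as before: a plain class instead of db.Model, strings instead of db.Key values, and a hypothetical class name ArcTailFixed.

```python
class ArcTailFixed(object):
    """Hypothetical stand-in using the set-based __eq__ and frozenset-based __hash__."""

    def __init__(self, courses, forward_links):
        self.courses = list(courses)
        self.forwardLinks = list(forward_links)

    def __eq__(self, other):
        if not isinstance(other, ArcTailFixed):
            return False
        return (set(self.courses) == set(other.courses) and
                set(self.forwardLinks) == set(other.forwardLinks))

    def __hash__(self):
        return hash((frozenset(self.courses), frozenset(self.forwardLinks)))


x = ArcTailFixed(["CS101", "CS102"], ["head1", "head2"])
y = ArcTailFixed(["CS102", "CS101", "CS101"], ["head2", "head1"])  # reordered, with a duplicate
z = ArcTailFixed(["CS101", "CS102"], ["head3"])                    # different forward links

print(x == y)              # True
print(hash(x) == hash(y))  # True: equal frozensets always hash equal
print(x == z)              # False
print(len({x, y, z}))      # 2: x and y collapse into one set member
```

Order and duplicates no longer matter, and any two objects that compare equal are guaranteed to share a hash.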

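Edit: removed the calls to tuple from __hash__, which were at best redundant (and possibly damaging, as pointed out in a comment by @Mark [[tx!!!]]); changed set to frozenset in the hashing, as suggested in a comment by @Phillips [[tx!!!]].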
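The reasoning behind both changes can be seen with plain lists. This is an illustrative sketch; the variable names and course codes are made up for the example:

```python
courses_a = ["CS101", "CS102"]
courses_b = ["CS102", "CS101"]  # same members, different order

# Set-based equality ignores order...
print(set(courses_a) == set(courses_b))      # True
# ...but tuples are order-sensitive, so a tuple-based hash can disagree
# with a set-based __eq__.
print(tuple(courses_a) == tuple(courses_b))  # False

# frozenset is hashable, and equal frozensets always hash equal.
print(hash(frozenset(courses_a)) == hash(frozenset(courses_b)))  # True

# A plain set is mutable and therefore cannot be hashed at all.
try:
    hash(set(courses_a))
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
print(set_is_hashable)  # False
```

This is why the question's updated version, which compares with set but still hashes with tuple, can give different hashes to two objects that compare equal.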
