Why is explicitly calling a magic method slower than the "sugared" syntax?

Question

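I was playing around with a small custom data object that needs to be hashable, comparable, and fast, and I ran into some odd timing results. Some of the comparisons (and the hash method) simply delegate to an attribute, so I had been using code like this: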

def __hash__(self):
    return self.foo.__hash__()

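While testing, however, I found that hash(self.foo) is noticeably faster. Out of curiosity I also tested __eq__, __ne__, and the other special comparison methods, and found that every one of them ran faster when I used the sugared form (==, !=, <, and so on). Why is that? I had assumed the sugared forms call the same functions under the hood, but perhaps they don't?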
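One way to check whether the two spellings really take the same path is to compare the bytecode they compile to. The following is only a minimal sketch, assuming CPython; the helper names `sugared`, `explicit`, and `opnames` are made up for illustration, and the exact opcode names differ between interpreter versions.

```python
import dis


def sugared(a, b):
    # Operator form: compiled to a dedicated comparison instruction.
    return a == b


def explicit(a, b):
    # Explicit form: an attribute lookup for __eq__ followed by a normal call.
    return a.__eq__(b)


def opnames(func):
    """Return the names of the bytecode instructions that make up func."""
    return [ins.opname for ins in dis.get_instructions(func)]


sugar_ops = opnames(sugared)
explicit_ops = opnames(explicit)

print(sugar_ops)
print(explicit_ops)
```

On CPython the operator form shows a comparison instruction, while the explicit form shows an attribute (or method) load followed by a generic call. That suggests the explicit spelling does extra work at the Python level (looking up `__eq__` by name and calling it like any other method) before it reaches the same underlying comparison.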

Timing results

Setup: a thin wrapper around an instance attribute that controls all of the comparisons.

Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> 
>>> sugar_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...             self._foo = f
...     def __hash__(self):
...             return hash(self._foo)
...     def __eq__(self, other):
...             return self._foo == other._foo
...     def __ne__(self, other):
...             return self._foo != other._foo
...     def __lt__(self, other):
...             return self._foo < other._foo
...     def __gt__(self, other):
...             return self._foo > other._foo
... '''
>>> explicit_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...             self._foo = f
...     def __hash__(self):
...             return self._foo.__hash__()
...     def __eq__(self, other):
...             return self._foo.__eq__(other._foo)
...     def __ne__(self, other):
...             return self._foo.__ne__(other._foo)
...     def __lt__(self, other):
...             return self._foo.__lt__(other._foo)
...     def __gt__(self, other):
...             return self._foo.__gt__(other._foo)
... '''
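Speed aside, the explicit spelling is not always a drop-in replacement for the operator. A small sketch using built-in `int` and `float` (chosen here purely for illustration; the wrapped datetime values in this question are not affected): calling `__eq__` directly can return the `NotImplemented` sentinel, whereas `==` goes on to try the reflected comparison.

```python
# int.__eq__ does not handle a float operand, so the direct call
# returns the NotImplemented sentinel rather than a bool.
explicit_result = (1).__eq__(1.0)

# The == operator sees NotImplemented from int and then tries the
# reflected comparison on the float operand, which succeeds.
sugared_result = (1 == 1.0)

print(explicit_result)
print(sugared_result)
```

So the operators do more than call one method: they also handle the fallback protocol, which is one more reason the two forms are not strictly the same function call.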

Tests

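My custom object wraps a datetime, so that is what I used here, though it shouldn't matter. Yes, I create the datetimes inside the tests, so there is some extra overhead, but that overhead is constant from one test to the next and shouldn't affect the results. I have left out the __ne__ and __gt__ results for brevity; they were essentially the same as the ones shown here.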

>>> test_hash = '''\
... for i in range(1, 1000):
...     hash(Thin(datetime.datetime.fromordinal(i)))
... '''
>>> test_eq = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a == a # True
...     a == b # False
... '''
>>> test_lt = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a < b # True
...     b < a # False
... '''

Results

>>> min(timeit.repeat(test_hash, explicit_setup, number=1000, repeat=20))
1.0805227295846862
>>> min(timeit.repeat(test_hash, sugar_setup, number=1000, repeat=20))
1.0135617737162192
>>> min(timeit.repeat(test_eq, explicit_setup, number=1000, repeat=20))
2.349765956168767
>>> min(timeit.repeat(test_eq, sugar_setup, number=1000, repeat=20))
2.1486044757355103
>>> min(timeit.repeat(test_lt, explicit_setup, number=500, repeat=20))
1.156479287717275
>>> min(timeit.repeat(test_lt, sugar_setup, number=500, repeat=20))
1.0673696685109917

Hash:
- Explicit: 1.0805227295846862
- Sugared: 1.0135617737162192
Equality:
- Explicit: 2.349765956168767
- Sugared: 2.1486044757355103
Less than:
- Explicit: 1.156479287717275
- Sugared: 1.0673696685109917
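To put the figures above in proportion, the relative slowdown of the explicit form can be computed directly from the reported minimums. This is just arithmetic on the numbers already listed; the names `timings` and `slowdown_percent` are made up for this sketch.

```python
# (explicit, sugared) minimum timings in seconds, copied from the results above.
timings = {
    "hash": (1.0805227295846862, 1.0135617737162192),
    "eq": (2.349765956168767, 2.1486044757355103),
    "lt": (1.156479287717275, 1.0673696685109917),
}


def slowdown_percent(explicit, sugared):
    """How much slower the explicit form is, as a percentage of the sugared time."""
    return (explicit - sugared) / sugared * 100.0


slowdowns = {name: slowdown_percent(e, s) for name, (e, s) in timings.items()}

for name, pct in slowdowns.items():
    print("{}: {:.1f}% slower".format(name, pct))
```

The explicit calls come out roughly 6 to 10 percent slower in these runs. Because the timed loops also include object construction, the gap attributable to the call itself is probably somewhat larger in relative terms.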


