我如何从hashlib.sha256在Python中?

2024-04-18 15:55:55 发布

您现在位置:Python中文网/ 问答频道 /正文

天真的尝试失败得很惨:

import hashlib

class fred(hashlib.sha256):
    pass

-> TypeError: Error when calling the metaclass bases
       cannot create 'builtin_function_or_method' instances

嗯,事实证明hashlib.sha256是可调用的,不是类。尝试一些更有创意的东西也不管用:

^{pr2}$

嗯。。。在

那么,我该怎么做呢?在

以下是我想要实现的目标:

class shad_256(sha256):
    """Double SHA - sha256(sha256(data).digest())
Less susceptible to length extension attacks than sha256 alone."""
    def digest(self):
        return sha256(sha256.digest(self)).digest()
    def hexdigest(self):
        return sha256(sha256.digest(self)).hexdigest()

基本上,我希望所有的事情都能通过,除了有人调用结果时,我想插入我自己的额外步骤。有没有一种聪明的方法可以用__new__或某种元类魔术来实现这一点?在

我有一个很满意的解决方案,我把它作为一个答案贴出来了,但我真的很感兴趣,看看是否有人能想出更好的办法。或者在可读性方面花费很少的冗余或者更快(尤其是在调用update)的同时仍然具有一定的可读性。在

更新:我运行了一些测试:

# test_sha._timehash takes three parameters, the hash object generator to use,
# the number of updates and the size of the updates.

# Built in hashlib.sha256
$ python2.7 -m timeit -n 100 -s 'import test_sha, hashlib' 'test_sha._timehash(hashlib.sha256, 20000, 512)'
100 loops, best of 3: 104 msec per loop

# My wrapper based approach (see my answer)
$ python2.7 -m timeit -n 100 -s 'import test_sha, hashlib' 'test_sha._timehash(test_sha.wrapper_shad_256, 20000, 512)'
100 loops, best of 3: 108 msec per loop

# Glen Maynard's getattr based approach
$ python2.7 -m timeit -n 100 -s 'import test_sha, hashlib' 'test_sha._timehash(test_sha.getattr_shad_256, 20000, 512)'
100 loops, best of 3: 103 msec per loop

Tags: ofthetestimportselfhashlibbestsha
3条回答

所以,我根据格伦的回答得出了这个答案,这就是我给他赏金的原因:

import hashlib

class _double_wrapper(object):
    """This wrapper exists because the various hashes from hashlib are
    factory functions and there is no type that can be derived from.
    So this class simulates deriving from one of these factory
    functions as if it were a class and then implements the 'd'
    version of the hash function which avoids length extension attacks
    by applying H(H(text)) instead of just H(text)."""

    __slots__ = ('_wrappedinstance', '_wrappedfactory', 'update')
    def __init__(self, wrappedfactory, *args):
        self._wrappedfactory = wrappedfactory
        self._assign_instance(wrappedfactory(*args))

    def _assign_instance(self, instance):
        "Assign new wrapped instance and set update method."
        self._wrappedinstance = instance
        self.update = instance.update

    def digest(self):
        "return the current digest value"
        return self._wrappedfactory(self._wrappedinstance.digest()).digest()

    def hexdigest(self):
        "return the current digest as a string of hexadecimal digits"
        return self._wrappedfactory(self._wrappedinstance.digest()).hexdigest()

    def copy(self):
        "return a copy of the current hash object"
        new = self.__class__()
        new._assign_instance(self._wrappedinstance.copy())
        return new

    digest_size = property(lambda self: self._wrappedinstance.digest_size,
                           doc="number of bytes in this hashes output")
    digestsize = digest_size
    block_size = property(lambda self: self._wrappedinstance.block_size,
                          doc="internal block size of hash function")

class shad_256(_double_wrapper):
    """
    Double SHA - sha256(sha256(data))
    Less susceptible to length extension attacks than SHA2_256 alone.

    >>> import binascii
    >>> s = shad_256('hello world')
    >>> s.name
    'shad256'
    >>> int(s.digest_size)
    32
    >>> int(s.block_size)
    64
    >>> s.hexdigest()
    'bc62d4b80d9e36da29c16c5d4d9f11731f36052c72401a76c23c0fb5a9b74423'
    >>> binascii.hexlify(s.digest()) == s.hexdigest()
    True
    >>> s2 = s.copy()
    >>> s2.digest() == s.digest()
    True
    >>> s2.update("text")
    >>> s2.digest() == s.digest()
    False
    """
    __slots__ = ()
    def __init__(self, *args):
        super(shad_256, self).__init__(hashlib.sha256, *args)
    name = property(lambda self: 'shad256', doc='algorithm name')

这有点冗长,但是从文档的角度来看,这个类可以很好地工作,并且有一个相对清晰的实现。使用格伦的优化,update是尽可能快的。在

有一个麻烦,就是update函数显示为数据成员,并且没有docstring。我认为效率是可以接受的。在

只需使用__getattr__使所有未定义自己的属性都回到底层对象:

import hashlib

class shad_256(object):
    """
    Double SHA - sha256(sha256(data).digest())
    Less susceptible to length extension attacks than sha256 alone.

    >>> s = shad_256('hello world')
    >>> s.digest_size
    32
    >>> s.block_size
    64
    >>> s.sha256.hexdigest()
    'b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9'
    >>> s.hexdigest()
    'bc62d4b80d9e36da29c16c5d4d9f11731f36052c72401a76c23c0fb5a9b74423'
    >>> s.nonexistant()
    Traceback (most recent call last):
    ...
    AttributeError: '_hashlib.HASH' object has no attribute 'nonexistant'
    >>> s2 = s.copy()
    >>> s2.digest() == s.digest()
    True
    >>> s2.update("text")
    >>> s2.digest() == s.digest()
    False
    """
    def __init__(self, data=None):
        self.sha256 = hashlib.sha256()
        if data is not None:
            self.update(data)

    def __getattr__(self, key):
        return getattr(self.sha256, key)

    def _get_final_sha256(self):
        return hashlib.sha256(self.sha256.digest())

    def digest(self):
        return self._get_final_sha256().digest()

    def hexdigest(self):
        return self._get_final_sha256().hexdigest()

    def copy(self):
        result = shad_256()
        result.sha256 = self.sha256.copy()
        return result

if __name__ == "__main__":
    import doctest
    doctest.testmod()

这基本上消除了update调用的开销,但并不完全消除。如果要完全消除它,请将其添加到__init__(相应地在copy中):

^{pr2}$

这将消除查找update时额外的__getattr__调用。在

这一切都利用了Python成员函数中一个更有用但常常被忽视的属性:函数绑定。记住,您可以这样做:

a = "hello"
b = a.upper
b()

因为引用成员函数不会返回原始函数,而是返回该函数与对象的绑定。这就是为什么上面的__getattr__返回self.sha256.update时,返回的函数正确地操作self.sha256,而不是{}。在

创建一个新类,从对象派生,创建一个hashlib.sha256在init中定义成员变量,然后定义哈希类所需的方法,并代理到成员变量的相同方法。在

比如:

import hashlib

class MyThing(object):
    def __init__(self):
        self._hasher = hashlib.sha256()

    def digest(self):
        return self._hasher.digest()

其他方法也是如此。在

相关问题 更多 >