有没有更快的方法将任意大整数转换为大端字节序列？

8 投票

4 回答

7171 浏览

提问于 2025-04-16 08:04

我有一段Python代码来实现这个功能：

from struct import pack as _pack

def packl(lnum, pad = 1):
    if lnum < 0:
        raise RangeError("Cannot use packl to convert a negative integer "
                         "to a string.")
    count = 0
    l = []
    while lnum > 0:
        l.append(lnum & 0xffffffffffffffffL)
        count += 1
        lnum >>= 64
    if count <= 0:
        return '\0' * pad
    elif pad >= 8:
        lens = 8 * count % pad
        pad = ((lens != 0) and (pad - lens)) or 0
        l.append('>' + 'x' * pad + 'Q' * count)
        l.reverse()
        return _pack(*l)
    else:
        l.append('>' + 'Q' * count)
        l.reverse()
        s = _pack(*l).lstrip('\0')
        lens = len(s)
        if (lens % pad) != 0:
            return '\0' * (pad - lens % pad) + s
        else:
            return s

在我的电脑上，把 2**9700 - 1 转换成字节字符串大约需要174微秒。如果我愿意使用Python 2.7和Python 3.x特有的 bit_length 方法，我可以通过在一开始就把 l 数组预先分配成准确的大小，并使用 l[something] = 这种写法，而不是 l.append，把时间缩短到159微秒。

有没有什么方法可以让这个过程更快呢？这个方法将用于转换在加密中使用的大素数，还有一些（但不多）较小的数字。

编辑

在Python 3.2之前，这个方法是目前最快的选择，所需时间大约是被接受答案的一半：

def packl(lnum, padmultiple=1):
    """Packs the lnum (which must be convertable to a long) into a
       byte string 0 padded to a multiple of padmultiple bytes in size. 0
       means no padding whatsoever, so that packing 0 result in an empty
       string.  The resulting byte string is the big-endian two's
       complement representation of the passed in long."""

    if lnum == 0:
        return b'\0' * padmultiple
    elif lnum < 0:
        raise ValueError("Can only convert non-negative numbers.")
    s = hex(lnum)[2:]
    s = s.rstrip('L')
    if len(s) & 1:
        s = '0' + s
    s = binascii.unhexlify(s)
    if (padmultiple != 1) and (padmultiple != 0):
        filled_so_far = len(s) % padmultiple
        if filled_so_far != 0:
            s = b'\0' * (padmultiple - filled_so_far) + s
    return s

def unpackl(bytestr):
    """Treats a byte string as a sequence of base 256 digits
    representing an unsigned integer in big-endian format and converts
    that representation into a Python integer."""

    return int(binascii.hexlify(bytestr), 16) if len(bytestr) > 0 else 0

在Python 3.2中，int 类有 to_bytes 和 from_bytes 函数，可以更快地完成这个任务，比上面提到的方法要快得多。

性能优化时间复杂度加密算法数据转换字节字符串大整数数组预分配大端字节序列

4 个回答

我觉得你最好还是用numpy，这个库里面肯定有现成的功能可以用。而且用array模块可能会更快一些。不过我还是试着给你讲讲。

根据我的经验，创建一个生成器，然后用列表推导式或者内置的求和功能，通常比用循环一个一个地往列表里加东西要快，因为加东西的过程可以在内部完成。而且，对于一个很大的字符串来说，使用'lstrip'这个操作肯定是比较耗费资源的。

还有一些风格上的建议：特殊情况并没有那么特殊；而且你似乎没有注意到新的x if y else z这种写法。:) 不过其实我们也不一定需要用到它。;)

from struct import pack as _pack


Q_size = 64
Q_bitmask = (1L << Q_size) - 1L


def quads_gen(a_long):
    while a_long:
        yield a_long & Q_bitmask
        a_long >>= Q_size


def pack_long_big_endian(a_long, pad = 1):
    if lnum < 0:
        raise RangeError("Cannot use packl to convert a negative integer "
                         "to a string.")
    qs = list(reversed(quads_gen(a_long)))
    # Pack the first one separately so we can lstrip nicely.
    first = _pack('>Q', qs[0]).lstrip('\x00')
    rest = _pack('>%sQ' % len(qs) - 1, *qs[1:])
    count = len(first) + len(rest)
    # A little math trick that depends on Python's behaviour of modulus
    # for negative numbers - but it's well-defined and documented
    return '\x00' * (-count % pad) + first + rest

回答于 2025-04-16 由 Python大师

分享举报

这里有一个解决方案，它通过 ctypes 调用 Python/C API。目前这个方案使用了 NumPy，但如果不能使用 NumPy，也可以仅仅用 ctypes 来实现。

import numpy
import ctypes
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               numpy.ctypeslib.ndpointer(numpy.uint8),
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes_numpy(lnum):
    a = numpy.zeros(lnum.bit_length()//8 + 1, dtype=numpy.uint8)
    PyLong_AsByteArray(lnum, a, a.size, 0, 1)
    return a

在我的电脑上，这个方法比你提到的方案快了15倍。

编辑：这是使用 ctypes 仅实现的相同代码，并且返回的是一个字符串，而不是 NumPy 数组：

import ctypes
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               ctypes.c_char_p,
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes(lnum):
    a = ctypes.create_string_buffer(lnum.bit_length()//8 + 1)
    PyLong_AsByteArray(lnum, a, len(a), 0, 1)
    return a.raw

这个方法又快了两倍，总体上在我的电脑上速度提升了30倍。

回答于 2025-04-16 由 Python大师

分享举报

为了让这个问题更加完整，也为了将来阅读这个问题的人：

从Python 3.2开始，新增了两个函数：int.from_bytes()和int.to_bytes()。这两个函数可以在不同的字节顺序下，将bytes（字节）和int（整数）对象进行相互转换。

回答于 2025-04-16 由 Python大师

分享举报

有没有更快的方法将任意大整数转换为大端字节序列？

4 个回答

撰写回答