Can Python 3 pickle byte objects larger than 4GB?


Based on this comment and the referenced documentation, pickle protocol 4 in Python 3.4+ should be able to pickle byte objects larger than 4 GB.

However, on Mac OS X 10.10.4 with Python 3.4.3 or Python 3.5.0b2, I get an error when I try to pickle a large bytearray:

>>> import pickle
>>> x = bytearray(8 * 1000 * 1000 * 1000)
>>> fp = open("x.dat", "wb")
>>> pickle.dump(x, fp, protocol = 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument

Is there an error in my code, or am I misunderstanding the documentation?


1 answer

#1 · Forum user

Here is the full workaround, although pickle.load no longer seems to choke on a huge file (I am on Python 3.5.2), so strictly speaking only pickle.dump needs the wrapper to work correctly.

import pickle

class MacOSFile(object):
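    """Proxy around a file object that splits large reads and writes into
    chunks smaller than 2 GiB, working around the macOS read/write limitation
    that raises OSError: [Errno 22] Invalid argument for bigger requests."""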

    def __init__(self, f):
        self.f = f

    def __getattr__(self, item):
        return getattr(self.f, item)

    def read(self, n):
        # print("reading total_bytes=%s" % n, flush=True)
        if n >= (1 << 31):
            buffer = bytearray(n)
            idx = 0
            while idx < n:
                batch_size = min(n - idx, (1 << 31) - 1)  # keep each read below the 2 GiB limit
                # print("reading bytes [%s,%s)..." % (idx, idx + batch_size), end="", flush=True)
                buffer[idx:idx + batch_size] = self.f.read(batch_size)
                # print("done.", flush=True)
                idx += batch_size
            return buffer
        return self.f.read(n)

    def write(self, buffer):
        n = len(buffer)
        print("writing total_bytes=%s..." % n, flush=True)
        idx = 0
        while idx < n:
            batch_size = min(n - idx, (1 << 31) - 1)  # keep each write below the 2 GiB limit
            print("writing bytes [%s, %s)... " % (idx, idx + batch_size), end="", flush=True)
            self.f.write(buffer[idx:idx + batch_size])
            print("done.", flush=True)
            idx += batch_size


def pickle_dump(obj, file_path):
    with open(file_path, "wb") as f:
        return pickle.dump(obj, MacOSFile(f), protocol=pickle.HIGHEST_PROTOCOL)


def pickle_load(file_path):
    with open(file_path, "rb") as f:
        return pickle.load(MacOSFile(f))

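As a quick sanity check, the wrapper can be exercised like this (a minimal sketch reusing the 8 GB bytearray and the x.dat file name from the question; it needs enough RAM and disk for the full object):

x = bytearray(8 * 1000 * 1000 * 1000)
pickle_dump(x, "x.dat")          # writes in chunks below 2 GiB, no OSError
y = pickle_load("x.dat")         # reads back in chunks
assert len(y) == len(x)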