How to monkey-patch np.savez_compressed to add a compression level, without editing the numpy source files?
I need to change the ZIP compresslevel (compression level) that np.savez_compressed uses internally. There is a feature proposal on NumPy's GitHub, but it has not been implemented yet.
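I see two options:

1. Modify the source file /numpy/lib/npyio.py and replace zipf = zipfile_factory(file, mode="w", compression=compression) with <idem>..., compresslevel=compresslevel). The drawback is that I would have to redo the change after every reinstall or upgrade, for example after running pip install numpy, so this is not a good solution.

2. Monkey-patch NumPy from my own code, without editing the source files. How can this be done?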
I tried the second option, but I get the error ValueError: seek of closed file, and I don't understand why:

    import numpy as np

    def _savez(file, args, kwds, compress, allow_pickle=True, pickle_kwargs=None):
        import zipfile
        if not hasattr(file, 'write'):
            file = os_fspath(file)
            if not file.endswith('.npz'):
                file = file + '.npz'
        namedict = kwds
        for i, val in enumerate(args):
            key = 'arr_%d' % i
            if key in namedict.keys():
                raise ValueError("Cannot use un-named variables and keyword %s" % key)
            namedict[key] = val
        if compress:
            compression = zipfile.ZIP_DEFLATED
        else:
            compression = zipfile.ZIP_STORED
        zipf = np.lib.npyio.zipfile_factory(file, mode="w", compression=compression, compresslevel=2)  # !! the only modified line !!
        for key, val in namedict.items():
            fname = key + '.npy'
            val = np.asanyarray(val)
            # always force zip64, gh-10776
            with zipf.open(fname, 'w', force_zip64=True) as fid:
                format.write_array(fid, val, allow_pickle=allow_pickle, pickle_kwargs=pickle_kwargs)
        zipf.close()

    np.lib.npyio._savez = _savez

    x = np.array([1, 2, 3, 4])
    with open("test.npz", "wb") as f:
        np.savez_compressed(f, x=x)

1 Answer
I found a simpler solution, which is to replace the zipfile_factory helper instead of the whole _savez function:

    import os
    import zipfile

    import numpy as np

    def zipfile_factory(file, *args, **kwargs):
        # Accept path-like objects as well as open file objects.
        if not hasattr(file, 'read'):
            file = os.fspath(file)
        kwargs['allowZip64'] = True
        kwargs['compresslevel'] = 4  # 0 to 9 for ZIP_DEFLATED
        return zipfile.ZipFile(file, *args, **kwargs)

    # Replace the factory that npyio.py calls to create the archive.
    np.lib.npyio.zipfile_factory = zipfile_factory

    with open("test.npz", "wb") as f:
        np.savez_compressed(f, x=np.ones(10_000_000))

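One way to check that a compresslevel setting actually changes the output, independently of NumPy, is to write the same bytes through zipfile.ZipFile at several levels and compare archive sizes. The following is only a sketch: the helper name zipped_size, the member name data.bin, and the payload are made up for illustration and are not part of NumPy or of the answer above.

```python
import io
import zipfile


def zipped_size(payload: bytes, level: int) -> int:
    """Return the size of an in-memory ZIP archive holding `payload`,
    written with ZIP_DEFLATED at the given compresslevel."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer,
        mode="w",
        compression=zipfile.ZIP_DEFLATED,
        compresslevel=level,
    ) as zf:
        # "data.bin" is an arbitrary member name chosen for this sketch.
        zf.writestr("data.bin", payload)
    return len(buffer.getvalue())


# Hypothetical, fairly compressible payload: 50,000 little-endian integers.
payload = b"".join(i.to_bytes(4, "little") for i in range(50_000))

# Level 0 stores the data without compressing it; higher levels trade
# speed for a smaller archive.
sizes = {level: zipped_size(payload, level) for level in (0, 1, 4, 9)}
for level, size in sizes.items():
    print(f"compresslevel={level}: {size} bytes")
```

The same comparison can be made on the .npz files themselves by saving one array with different patched levels and checking os.path.getsize on each result.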
Addendum, my earlier solution:

In the meantime I found the answer: format has to be replaced with np.lib.npyio.format. With that change, this works:

    import os
    import zipfile

    import numpy as np

    def _savez(file, args, kwds, compress, allow_pickle=True, pickle_kwargs=None):
        # Accept path-like objects as well as open file objects.
        if not hasattr(file, 'write'):
            file = os.fspath(file)
            if not file.endswith('.npz'):
                file = file + '.npz'
        namedict = kwds
        for i, val in enumerate(args):
            key = 'arr_%d' % i
            if key in namedict.keys():
                raise ValueError("Cannot use un-named variables and keyword %s" % key)
            namedict[key] = val
        if compress:
            compression = zipfile.ZIP_DEFLATED
        else:
            compression = zipfile.ZIP_STORED
        # The only functional change: pass an explicit compresslevel.
        zipf = np.lib.npyio.zipfile_factory(file, mode="w", compression=compression, compresslevel=1)
        for key, val in namedict.items():
            fname = key + '.npy'
            val = np.asanyarray(val)
            # always force zip64, gh-10776
            with zipf.open(fname, 'w', force_zip64=True) as fid:
                # Fully qualified: a bare `format` here is the built-in function.
                np.lib.npyio.format.write_array(fid, val, allow_pickle=allow_pickle, pickle_kwargs=pickle_kwargs)
        zipf.close()

    np.lib.npyio._savez = _savez

    with open("test.npz", "wb") as f:
        np.savez_compressed(f, x=np.array([1, 2, 3]))

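Why the bare name format fails in a copied function can be sketched without NumPy. A function looks up global names in the module where it is defined, not in the module it is later assigned into. Inside npyio.py, format presumably refers to NumPy's own format submodule; in a user script, the same name falls through to Python's built-in format function, which has no write_array attribute. The function name below is made up for illustration.

```python
import builtins


def uses_bare_format():
    # Defined in this script, so `format` is resolved in this script's
    # globals and then in builtins, never in numpy.lib.npyio's namespace.
    return format


resolved = uses_bare_format()

print(resolved is builtins.format)       # True
print(hasattr(resolved, "write_array"))  # False
```

This is why replacing the whole function means qualifying every module-level name it relied on, while replacing only zipfile_factory avoids the problem.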