According to the project page of fastparquet, fastparquet supports various compression methods:
Optional (compression algorithms; gzip is always available):
snappy (aka python-snappy) lzo brotli lz4 zstandard
In particular, I am interested in zstandard.
However, the documentation of fastparquet.write says:
compression to apply to each column, e.g. GZIP or SNAPPY or a dict like {"col1": "SNAPPY", "col2": None} to specify per column compression types. In both cases, the compressor settings would be the underlying compressor defaults. To pass arguments to the underlying compressor, each dict entry should itself be a dictionary:
{ "col1": { "type": "LZ4", "args": { "compression_level": 6, "content_checksum": True } }, "col2": { "type": "SNAPPY", "args": None }, "_default": { "type": "GZIP", "args": None } }
There is no mention of zstandard. What is worse, if I write
fastparquet.write('outfile.parq', df, compression='LZ4')
it raises an error like this:
Compression 'LZ4' not available. Options: ['GZIP', 'UNCOMPRESSED']
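So does fastparquet only support "GZIP"? That is very different from what the project page says! Am I missing some packages? How can I use fastparquet with all the compression algorithms the project page lists?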
Yes, you are probably missing some packages. Your system must first have the Python bindings for LZ4 and/or zstandard installed. For more details, see the source code.
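To see what your own installation supports before installing anything, something like the following may help. It is only a sketch: it assumes the codec registry is exposed as fastparquet.compression.compressions (the error message above appears to list its keys), and available_compressions is a helper name made up for this example.

```python
def available_compressions():
    """Return the sorted codec names fastparquet has registered,
    or None if fastparquet itself cannot be imported."""
    try:
        # Assumption: the registry dict lives under this name.
        from fastparquet.compression import compressions
    except ImportError:
        return None
    return sorted(compressions)


print(available_compressions())
```

If LZ4 or ZSTD is absent from the printed list, the corresponding binding is most likely not installed.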
For LZ4: if
import lz4.block
raises a ModuleNotFoundError, then go ahead and install it with
pip install lz4
Similarly for zstandard:
pip install zstandard
For brotli:
pip install brotlipy
And for lzo:
pip install python-lzo
And for snappy:
pip install python-snappy
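If you would rather check all of them in one go, here is a small sketch using only the standard library. The mapping from pip package name to import name is my assumption about how each binding is imported (for example, python-snappy is imported as snappy), and missing_bindings is a hypothetical helper, not part of fastparquet.

```python
import importlib

# pip package name -> module name it is imported as (assumed mapping)
BINDINGS = {
    "python-snappy": "snappy",
    "lz4": "lz4.block",
    "zstandard": "zstandard",
    "brotlipy": "brotli",
    "python-lzo": "lzo",
}


def missing_bindings(bindings=BINDINGS):
    """Return the pip package names whose module cannot be imported."""
    missing = []
    for package, module in bindings.items():
        try:
            importlib.import_module(module)
        except ImportError:  # also covers ModuleNotFoundError
            missing.append(package)
    return missing


for package in missing_bindings():
    print(f"missing: pip install {package}")
```

Anything it prints is a package you still need to install before fastparquet can offer that compression.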