Python 2.7: Compressing data in the XZ format using the "lzma" module

import lzma as xz

in_file = open('/home/ki2ne/Desktop/song.wav', 'rb')
input_data = in_file.read()

compressed_data = xz.compress(input_data)
out_file = open('/home/ki2ne/Desktop/song.wav.xz', 'wb')
# Write the compressed bytes; without this the output file stays empty
out_file.write(compressed_data)
in_file.close()
out_file.close()

# I have created two identical text files with some random phrases
from subprocess import call
from hashlib import sha256
from backports import lzma as xz

f2 = open("test2.txt", 'rb')
f2_buf = buffer(f2.read())
call(["xz", "test1.txt"])
f2_xzbuf = buffer(xz.compress(f2_buf))
f1 = open("test1.txt.xz", 'rb')
f1_xzbuf = buffer(f1.read())
f1.close(); f2.close()
f1sum = sha256(); f2sum = sha256()
f1sum.update(f1_xzbuf); f2sum.update(f2_xzbuf)

if f1sum.hexdigest() == f2sum.hexdigest():
    print "Checksums OK"
else:
    print "Checksum Error"

2 Answers

User

Answer 1 · edited 2024-06-16 14:31:39

In my case (Ubuntu/Mint), to use the lzma module in Python 2.7, I installed backports.lzma directly with pip (I did not use GitHub), running it with sudo or as the root user:

pip2 install backports.lzma

FYI: pip2 has a --user option that does not require superuser privileges and installs the module for the local user only, but I have not tested this.

Before running the pip install, you must also install a mandatory dependency with your package manager: the liblzma library.

In my case the packages were named liblzma5 and liblzma-dev, but the package names may differ across Linux distributions and releases.

P.S. I also repeated the same steps successfully with conda in a different Linux environment (a cluster running an unknown distribution):

conda install backports
conda install backports.lzma --name pyEnvName
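Whichever install route is used, a quick round-trip is a reasonable way to confirm that the module imports and works. The following is only a minimal sketch, assuming the backport exposes the same compress/decompress functions as the Python 3 standard-library lzma module; the helper name roundtrip_ok is hypothetical.

```python
# Prefer the standard-library module (Python 3.3+) and fall back to the
# backport on Python 2.7. Assumption: backports.lzma mirrors the stdlib API.
# Note: on Python 2, a different package that is also named "lzma"
# (for example pyliblzma) could be picked up by the first import.
try:
    import lzma as xz
except ImportError:
    from backports import lzma as xz


def roundtrip_ok(payload):
    """Return True if compressing then decompressing gives back payload."""
    return xz.decompress(xz.compress(payload)) == payload


if __name__ == "__main__":
    print(roundtrip_ok(b"hello xz " * 100))
```

If the import fails on Python 2.7 even after the pip install, a missing liblzma development package is one likely cause.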

Hope this helps.

User

Answer 2 · edited 2024-06-16 14:31:39

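I wouldn't worry about differences between the compressed files: depending on the container format and the checksum type used in the .xz file, the compressed data can vary without affecting the contents.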
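To make this concrete, here is a small sketch that compresses the same bytes twice with different integrity checks and compares both the compressed streams and the decompressed content. It assumes the Python 3 lzma API (the format and check arguments, CHECK_CRC32, CHECK_CRC64), which backports.lzma is meant to mirror; the sample payload is made up.

```python
import hashlib

try:
    import lzma as xz
except ImportError:
    from backports import lzma as xz  # Python 2.7 backport

data = b"some random phrases\n" * 50  # made-up sample payload

# Same input, same container (.xz), different integrity check.
with_crc32 = xz.compress(data, format=xz.FORMAT_XZ, check=xz.CHECK_CRC32)
with_crc64 = xz.compress(data, format=xz.FORMAT_XZ, check=xz.CHECK_CRC64)

# Compare the compressed streams by hash, as the question does.
same_bytes = (hashlib.sha256(with_crc32).hexdigest()
              == hashlib.sha256(with_crc64).hexdigest())

# Compare what the streams decompress to.
same_content = xz.decompress(with_crc32) == xz.decompress(with_crc64) == data

print("identical .xz bytes:", same_bytes)
print("identical content:", same_content)
```

Comparing the decompressed content (or a hash of it) is therefore a more meaningful test than comparing hashes of the .xz files themselves.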

EDIT: I looked into this further and wrote this script to test the PyLZMA module for Python 2.x and the built-in lzma module for Python 3.x:

from __future__ import print_function
try:
    import lzma as xz
except ImportError:
    import pylzma as xz
import os

# compress with xz command line util
os.system('xz -zkf test.txt')

# now compress with lib
with open('test.txt', 'rb') as f, open('test.txt.xzpy', 'wb') as out:
    out.write(xz.compress(bytes(f.read())))

# compare the two files
from hashlib import md5

with open('test.txt.xz', 'rb') as f1, open('test.txt.xzpy', 'rb') as f2:
    hash1 = md5(f1.read()).hexdigest()
    hash2 = md5(f2.read()).hexdigest() 
    print(hash1, hash2)
    assert hash1 == hash2

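This compresses the file test.txt with both the xz command-line utility and the Python module, then compares the results. Under Python 3, lzma produces the same result as xz, but under Python 2, PyLZMA produces a different result that cannot be extracted with the xz command-line utility.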

Which module are you using in Python 2 that is called "lzma", and what command did you use to compress the data?

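EDIT 2: OK, I found the pyliblzma module for Python 2. However, it appears to use CRC32 as its default checksum algorithm (the others use CRC64), and there is a bug that prevents changing the checksum algorithm: https://bugs.launchpad.net/pyliblzma/+bug/1243344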

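You could try compressing with xz -C crc32 to compare the results, but I still have not managed to produce a valid compressed file with the Python 2 libraries.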
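One way to see which checksum a given .xz file uses is to read it from the stream header. The sketch below assumes the layout given in the .xz file format specification: a 6-byte magic followed by two stream-flag bytes, where the low four bits of the second flag byte hold the check ID. The function name xz_check_name is hypothetical, and the Python 3 lzma module is used here only to generate sample input.

```python
import lzma  # Python 3 stdlib; used here only to produce sample streams

XZ_MAGIC = b"\xfd7zXZ\x00"
# Check IDs as listed in the .xz format specification.
CHECK_NAMES = {0x00: "None", 0x01: "CRC32", 0x04: "CRC64", 0x0A: "SHA-256"}


def xz_check_name(stream):
    """Return the name of the integrity check declared in an .xz header."""
    header = bytearray(stream[:8])
    if len(header) < 8 or bytes(header[:6]) != XZ_MAGIC:
        raise ValueError("not an .xz stream")
    check_id = header[7] & 0x0F  # low nibble of the second stream-flag byte
    return CHECK_NAMES.get(check_id, "unknown (%d)" % check_id)


if __name__ == "__main__":
    sample = b"test data"
    print(xz_check_name(lzma.compress(sample)))
    print(xz_check_name(lzma.compress(sample, check=lzma.CHECK_CRC32)))
```

Passing the first 8 bytes of test.txt.xz and of the Python-produced file to this function would show whether a checksum mismatch is one reason their hashes differ.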
