Python LZMA: "Corrupt input data" error when trying to decompress

Posted 2024-06-17 09:44:47


import lzma
import requests
response = requests.get('http://content.warframe.com/PublicExport/index_en.txt.lzma')
data = lzma.decompress(response.content)

The error I get is:

_lzma.LZMAError: Corrupt input data

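I don't think the data is corrupt, because I can download the file from a browser and extract it with 7zip without any problem. I tried to find a solution online, but there doesn't seem to be much information about this issue. I also tried a different way of decompressing it, with no luck. (Python LZMA : Compressed data ended before the end-of-stream marker was reached)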

Edit: this is my current solution that "works". Basically, it trims bytes off the end and ignores the EOF error.

import requests
from lzma import FORMAT_AUTO, LZMADecompressor, LZMAError

def fix():
    response = requests.get('http://content.warframe.com/PublicExport/index_en.txt.lzma')
    data = response.content  # already a bytes object
    length = len(data)
    # Drop one byte from the end at a time until the remaining prefix decompresses.
    while True:
        try:
            result = decompress_lzma(data[:length])
            break
        except LZMAError:
            length -= 1

    print(result)

# FROM: https://stackoverflow.com/a/37400585/15041587
def decompress_lzma(data):
    results = []
    while True:
        decomp = LZMADecompressor(FORMAT_AUTO, None, None)
        try:
            res = decomp.decompress(data)
        except LZMAError:
            if results:
                break  # Leftover data is not a valid LZMA/XZ stream; ignore it.
            else:
                raise  # Error on the first iteration; bail out.
        results.append(res)
        data = decomp.unused_data
        if not data:
            break
        if not decomp.eof:
            raise LZMAError("Compressed data ended before the end-of-stream marker was reached")
    return b"".join(results)
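The trim-and-retry loop above decompresses the whole buffer again for every byte it drops. A lighter variation on the same idea is to feed the data to a single decompressor in small pieces and keep whatever was decoded before the first error. The following is only a sketch: `decompress_prefix` and its `chunk_size` parameter are hypothetical names, and whether it recovers the complete file depends on where in the stream the decoder actually fails, which has not been verified for this particular download.

```python
import lzma


def decompress_prefix(data, chunk_size=4096):
    """Decode as much of `data` as possible, stopping at the first LZMAError.

    Returns a tuple (decoded_bytes, error_or_None).
    """
    decomp = lzma.LZMADecompressor(lzma.FORMAT_AUTO)
    out = []
    for start in range(0, len(data), chunk_size):
        if decomp.eof:
            break  # clean end of stream; anything left is trailing data
        try:
            out.append(decomp.decompress(data[start:start + chunk_size]))
        except lzma.LZMAError as exc:
            # Output from earlier chunks is kept; anything produced inside
            # the failing call is lost, so smaller chunks recover more.
            return b"".join(out), exc
    return b"".join(out), None
```

Calling `decompress_prefix(response.content)` returns the recovered bytes together with the exception, if any, so the caller can decide whether a partial result is acceptable instead of silently ignoring the error.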

1 answer
Answer 1 · posted 2024-06-17 09:44:47

I can also open the file with 7zip. However, when I tried to decompress the file linked above with xz, I saw:

$ xz --format=lzma --decompress -t index_en.txt.lzma
xz: index_en.txt.lzma: Compressed data is corrupt

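I'm not entirely sure, but I suspect the file really is corrupted or nonstandard in some way. In other words, 7zip's ability to decompress this file successfully may be the exception rather than the rule.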
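One way to look for something nonstandard is to inspect the file's header. The sketch below assumes the legacy `.lzma` ("LZMA alone") layout: one properties byte, a 4-byte little-endian dictionary size, and an 8-byte little-endian uncompressed size, where all ones means the size is unknown. `parse_lzma_alone_header` is a hypothetical helper, and it makes no claim about what is actually wrong with this particular file.

```python
import struct


def parse_lzma_alone_header(blob):
    """Decode the 13-byte header of a legacy .lzma (LZMA alone) file."""
    if len(blob) < 13:
        raise ValueError("need at least 13 bytes")
    props, dict_size, size = struct.unpack("<BIQ", blob[:13])
    # The properties byte packs three values as (pb * 5 + lp) * 9 + lc.
    lc = props % 9
    lp = (props // 9) % 5
    pb = props // 45
    return {
        "lc": lc,
        "lp": lp,
        "pb": pb,
        "dict_size": dict_size,
        # All ones means "size unknown"; the stream then relies on an
        # end-of-stream marker instead of a byte count.
        "uncompressed_size": None if size == 0xFFFFFFFFFFFFFFFF else size,
    }
```

Comparing the result for the downloaded file (for example `parse_lzma_alone_header(response.content)`) against the header of a file produced by xz could show whether the two differ in how they declare the uncompressed size.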

To support this further, if I create a new LZMA file with xz, for example

xz --format=lzma --compress -k <file>

and then decompress and read that file in Python, it works fine.
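The same round trip can be reproduced without the xz command line. This is a minimal sketch using Python's `lzma` module, where `FORMAT_ALONE` selects the legacy `.lzma` container; the payload is stand-in data, not the Warframe file.

```python
import lzma

payload = b"example line\n" * 1000  # stand-in data, not the Warframe file

# FORMAT_ALONE writes the legacy .lzma container, like `xz --format=lzma`.
blob = lzma.compress(payload, format=lzma.FORMAT_ALONE)

# The default FORMAT_AUTO detects the container and decodes it.
restored = lzma.decompress(blob)
print(restored == payload)
```

If this check succeeds while the downloaded file still fails, that points at the file itself rather than at Python's handling of the `.lzma` format.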
