Help with basic LZW compression in Python

Asked 2025-04-16 22:18

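I'm just trying to write a very simple script that takes some input text and compresses it with the LZW algorithm, using this package: http://packages.python.org/lzw/

I've never done any encoding work in Python before, and I'm completely lost =( I can't find any documentation online other than the package's own page.

Here is the code I have so far: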

import lzw

file = lzw.readbytes("collectemailinfo.txt", buffersize=1024)
enc = lzw.compress(file)
print enc

Any help or advice would be greatly appreciated!

Thanks =)


1 Answer

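Here is the API documentation for the package: http://packages.python.org/lzw/lzw-module.html

You can find pseudocode for compression and decompression here.

Is there anything else that confuses you?

Here is an example: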

Python

In this version, the dictionaries hold values of mixed types (single-character strings and integer codes):

def compress(uncompressed):
    """Compress a string to a list of output symbols."""

    # Build the dictionary.
    dict_size = 256
    dictionary = {chr(i): chr(i) for i in range(dict_size)}
    # in Python 2: dictionary = dict((chr(i), chr(i)) for i in xrange(dict_size))

    w = ""
    result = []
    for c in uncompressed:
        wc = w + c
        if wc in dictionary:
            w = wc
        else:
            result.append(dictionary[w])
            # Add wc to the dictionary.
            dictionary[wc] = dict_size
            dict_size += 1
            w = c

    # Output the code for w.
    if w:
        result.append(dictionary[w])
    return result



def decompress(compressed):
    """Decompress a list of output ks to a string."""

    # Build the dictionary.
    dict_size = 256
    dictionary = {chr(i): chr(i) for i in range(dict_size)}
    # in Python 2: dictionary = dict((chr(i), chr(i)) for i in xrange(dict_size))

    # Read the first symbol without mutating the caller's list.
    w = result = compressed[0]
    for k in compressed[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == dict_size:
            entry = w + w[0]
        else:
            raise ValueError('Bad compressed k: %s' % k)
        result += entry

        # Add w+entry[0] to the dictionary.
        dictionary[dict_size] = w + entry[0]
        dict_size += 1

        w = entry
    return result

Usage:

compressed = compress('TOBEORNOTTOBEORTOBEORNOT')
print(compressed)
decompressed = decompress(compressed)
print(decompressed)

Output:

['T', 'O', 'B', 'E', 'O', 'R', 'N', 'O', 'T', 256, 258, 260, 265, 259, 261, 263]
TOBEORNOTTOBEORTOBEORNOT
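
Since the original goal was to compress a file, the mixed list printed above still has to be turned into bytes before it can be written anywhere. The following is only a minimal sketch of one way to do that, not part of the quoted example: the helper names `to_int_codes`, `from_int_codes`, `pack_codes` and `unpack_codes` are hypothetical, and it assumes every code fits in an unsigned 16-bit integer (at most 65535).

```python
import struct


def to_int_codes(symbols):
    """Map single-character strings to their code points; keep ints as-is."""
    return [ord(s) if isinstance(s, str) else s for s in symbols]


def from_int_codes(codes):
    """Inverse of to_int_codes: codes below 256 become characters again."""
    return [chr(c) if c < 256 else c for c in codes]


def pack_codes(codes):
    """Pack integer codes as big-endian unsigned 16-bit values.

    Assumption: no code exceeds 65535, otherwise struct.error is raised.
    """
    return struct.pack(">%dH" % len(codes), *codes)


def unpack_codes(data):
    """Inverse of pack_codes: bytes back to a list of integer codes."""
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


# The list printed by the example above.
symbols = ['T', 'O', 'B', 'E', 'O', 'R', 'N', 'O', 'T',
           256, 258, 260, 265, 259, 261, 263]

codes = to_int_codes(symbols)
packed = pack_codes(codes)

print(len(packed))                                    # 32
print(unpack_codes(packed) == codes)                  # True
print(from_int_codes(unpack_codes(packed)) == symbols)  # True
```

Note that 16 codes at two bytes each come to 32 bytes, which is more than the 24 characters of input. Fixed 16-bit codes only pay off on longer, repetitive text; practical LZW implementations typically use variable-width codes instead.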

Note: the compress/decompress example above is taken from here.
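
One part of `decompress` that tends to confuse people is the `elif k == dict_size` branch. The encoder adds a dictionary entry one step before the decoder can, so the decoder may receive a code it has not defined yet. In that case the missing entry must be the previous string plus its own first character. The sketch below is an integer-only variant written for illustration (the names `lzw_encode` and `lzw_decode` are hypothetical, and it assumes every input character has a code point below 256); the input `"AAAA"` triggers exactly that branch.

```python
def lzw_encode(text):
    """LZW-encode a string into a list of integer codes.

    Assumption: every character in text has a code point below 256.
    """
    table = {chr(i): i for i in range(256)}
    w = ""
    out = []
    for c in text:
        wc = w + c
        if wc in table:
            w = wc
        else:
            out.append(table[w])
            table[wc] = len(table)  # next free code
            w = c
    if w:
        out.append(table[w])
    return out


def lzw_decode(codes):
    """Decode a list of integer codes produced by lzw_encode."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    w = table[codes[0]]
    parts = [w]
    for k in codes[1:]:
        if k in table:
            entry = table[k]
        elif k == len(table):
            # Code not defined yet: it must be w plus w's first character.
            entry = w + w[0]
        else:
            raise ValueError("Bad compressed code: %s" % k)
        parts.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return "".join(parts)


codes = lzw_encode("AAAA")
print(codes)              # [65, 256, 65]
print(lzw_decode(codes))  # AAAA
```

Here the encoder emits 256 immediately after creating the entry for "AA", so when the decoder reads 256 its table still ends at 255 and it has to reconstruct "AA" from the previous string "A".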

Answered 2025-04-16 by Python大师
