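Basic LZW compression help in Python
I'm just trying to write a really basic script that takes some input text and compresses it with LZW, using this package: http://packages.python.org/lzw/
I've never done any encoding in Python before and I'm completely confused =( - I can't find any documentation for it online apart from the package information.
Here's what I have so far: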
import lzw
file = lzw.readbytes("collectemailinfo.txt", buffersize=1024)
enc = lzw.compress(file)
print enc
Any help or pointers would be much appreciated!
Thanks =)
1 Answer
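Here is the API documentation for the package: http://packages.python.org/lzw/lzw-module.html
You can view pseudocode for compression and decompression here.
What else is confusing you?
Here is an example in Python. In this version, the dictionary holds values of mixed types (single-character strings for the initial entries and integer codes for the entries added later):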
def compress(uncompressed):
    """Compress a string to a list of output symbols."""
    # Build the dictionary: every single character maps to itself.
    # range() is used here because it works on both Python 2.7 and Python 3.
    dict_size = 256
    dictionary = {chr(i): chr(i) for i in range(dict_size)}

    w = ""
    result = []
    for c in uncompressed:
        wc = w + c
        if wc in dictionary:
            # Keep extending the current match.
            w = wc
        else:
            # Emit the symbol for the longest known match.
            result.append(dictionary[w])
            # Add wc to the dictionary under the next free integer code.
            dictionary[wc] = dict_size
            dict_size += 1
            w = c

    # Output the code for w.
    if w:
        result.append(dictionary[w])
    return result

def decompress(compressed):
    """Decompress a list of output symbols to a string."""
    # Build the dictionary: every single character maps to itself.
    dict_size = 256
    dictionary = {chr(i): chr(i) for i in range(dict_size)}

    # Read the first symbol without modifying the caller's list.
    w = result = compressed[0]
    for k in compressed[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == dict_size:
            # The code refers to the entry that is about to be created.
            entry = w + w[0]
        else:
            raise ValueError('Bad compressed k: %s' % k)
        result += entry

        # Add w+entry[0] to the dictionary.
        dictionary[dict_size] = w + entry[0]
        dict_size += 1

        w = entry
    return result

Usage:
compressed = compress('TOBEORNOTTOBEORTOBEORNOT')
print(compressed)
decompressed = decompress(compressed)
print(decompressed)
Output:
['T', 'O', 'B', 'E', 'O', 'R', 'N', 'O', 'T', 256, 258, 260, 265, 259, 261, 263]
TOBEORNOTTOBEORTOBEORNOT
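
To make the integer codes in that output easier to read, here is a small sketch (Python 3) that replays the same dictionary-building loop and records which substring each new code stands for. The helper name `learned_entries` is made up for this illustration and is not part of the example above or of the lzw package.

```python
def learned_entries(text):
    """Return {code: substring} for every entry added beyond the 256 single characters."""
    known = {chr(i) for i in range(256)}  # the initial single-character entries
    entries = {}
    next_code = 256
    w = ""
    for c in text:
        wc = w + c
        if wc in known:
            w = wc
        else:
            # Same step as in compress(): wc becomes a new dictionary entry.
            known.add(wc)
            entries[next_code] = wc
            next_code += 1
            w = c
    return entries


entries = learned_entries('TOBEORNOTTOBEORTOBEORNOT')
# Look up the integer codes that appear in the compressed output.
for code in (256, 258, 260, 265, 259, 261, 263):
    print(code, repr(entries[code]))
```

Reading the output list left to right, the nine single characters spell TOBEORNOT, and the seven integer codes expand to the rest of the string.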
Note: this example is taken from here.
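
Coming back to the original goal of compressing a text file: the list returned by `compress` is not yet a sequence of bytes you can write to disk. The following is a rough sketch (Python 3 only) of one way to get there, and not how the lzw package does it. It is a variant of the algorithm above that works on bytes, uses integer codes throughout, and packs each code into a fixed 16 bits. All the names (`lzw_encode`, `lzw_decode`, `pack_codes`, `unpack_codes`, `compress_file`, `MAX_CODES`) are made up for this illustration, and the fixed 16-bit width with a capped dictionary is a simplifying assumption; real LZW implementations usually use variable-width codes.

```python
import struct

MAX_CODES = 1 << 16  # assumption: codes must fit in 16 bits, so the table stops growing here


def lzw_encode(data):
    """Encode a bytes object into a list of integer LZW codes."""
    table = {bytes([i]): i for i in range(256)}
    w = b""
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc
        else:
            codes.append(table[w])
            if len(table) < MAX_CODES:
                table[wc] = len(table)
            w = bytes([byte])
    if w:
        codes.append(table[w])
    return codes


def lzw_decode(codes):
    """Rebuild the original bytes from a list of integer LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for k in codes[1:]:
        if k in table:
            entry = table[k]
        elif k == len(table):
            # The code refers to the entry that is about to be created.
            entry = w + w[:1]
        else:
            raise ValueError("Bad compressed code: %s" % k)
        out.append(entry)
        if len(table) < MAX_CODES:
            table[len(table)] = w + entry[:1]
        w = entry
    return b"".join(out)


def pack_codes(codes):
    """Pack integer codes as big-endian unsigned 16-bit values."""
    return struct.pack(">%dH" % len(codes), *codes)


def unpack_codes(blob):
    """Inverse of pack_codes()."""
    return list(struct.unpack(">%dH" % (len(blob) // 2), blob))


def compress_file(src_path, dst_path):
    """Read src_path as bytes and write the packed LZW codes to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(pack_codes(lzw_encode(data)))


original = b"TOBEORNOTTOBEORTOBEORNOT"
packed = pack_codes(lzw_encode(original))
assert lzw_decode(unpack_codes(packed)) == original
print(len(original), len(packed))
```

For this 24-byte input the packed form comes out at 32 bytes, because each of the 16 codes takes two bytes; any savings would only show up on longer, repetitive input. With these helpers, something like `compress_file("collectemailinfo.txt", "collectemailinfo.lzw")` would write a compressed copy of the file from the question (the output filename is just an example).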