Reading UTF-8 characters from a gzip file in Python

import string
import gzip
import codecs

f = gzip.open('file.gz', 'r')
engines = {}
line = f.readline()
while line:
    parsed = string.split(line, u'\u0001')
    # do some things...
    line = f.readline()
for en in engines:
    print(en)

2 answers

User

Answer 1 · edited 2024-05-23 17:25:15

Perhaps:

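import codecs
import gzip

# fname is the path to the gzip file
zf = gzip.open(fname, 'rb')
reader = codecs.getreader("utf-8")
contents = reader(zf)  # wraps the byte stream and decodes it as UTF-8
for line in contents:
    pass  # each line is already a decoded unicode string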
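
To make the snippet above easier to try out, here is a minimal self-contained sketch in Python 3. The helper name `read_lines_utf8` and the sample data are made up for illustration; only `gzip.open` and `codecs.getreader` come from the answer itself.

```python
import codecs
import gzip
import os
import tempfile


def read_lines_utf8(path):
    """Return the decoded lines of a UTF-8 gzip file (hypothetical helper)."""
    with gzip.open(path, "rb") as zf:
        # Wrap the decompressed byte stream in a UTF-8 stream reader.
        reader = codecs.getreader("utf-8")(zf)
        return [line.rstrip("\n") for line in reader]


# Hypothetical sample data: two lines with \u0001-separated fields.
sample = "alpha\u0001beta\ngamma\u0001délta\n"
path = os.path.join(tempfile.mkdtemp(), "sample.gz")
with gzip.open(path, "wb") as out:
    out.write(sample.encode("utf-8"))

lines = read_lines_utf8(path)
fields = [line.split("\u0001") for line in lines]
```

Because the reader decodes as it goes, the `\u0001` split works on text rather than on raw bytes.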

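User

Answer 2 · edited 2024-05-23 17:25:15

I don't see why this should be so hard.

What exactly are you doing? Please explain "eventually it reads an invalid character".

It should be as simple as: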

import gzip
fp = gzip.open('foo.gz')
contents = fp.read() # contents now has the uncompressed bytes of foo.gz
fp.close()
u_str = contents.decode('utf-8') # u_str is now a unicode string
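
If the decode step itself is what fails, one way to narrow it down is to catch the `UnicodeDecodeError` and look at where it points. This is only a sketch in Python 3, assuming the problem is a byte sequence that is not valid UTF-8; the helper name `decode_or_report` and the sample bytes are invented for the example.

```python
import gzip
import os
import tempfile


def decode_or_report(raw):
    """Decode bytes as UTF-8; on failure, also return (offset, reason)."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as err:
        # Fall back to a lossy decode so the rest of the data is still usable.
        return raw.decode("utf-8", errors="replace"), (err.start, err.reason)


# Hypothetical sample file containing one byte (0xff) that is never valid UTF-8.
path = os.path.join(tempfile.mkdtemp(), "broken.gz")
with gzip.open(path, "wb") as out:
    out.write(b"ok\x01line\n" + b"bad\xff\n")

with gzip.open(path, "rb") as fp:
    contents = fp.read()

text, problem = decode_or_report(contents)
if problem is not None:
    offset, reason = problem
    print("invalid byte at offset", offset, "-", reason)
```

The offset refers to the uncompressed bytes, so it can be used to inspect `contents` around the failing position.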

Edit

This answer works for Python 2. For Python 3, see @SeppoEnarvi's answer at https://stackoverflow.com/a/19794943/610569 (it uses the rt mode for gzip.open).
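
For reference, the rt approach looks roughly like this in Python 3. It is a sketch rather than a copy of the linked answer: the file contents and the way `engines` is filled are assumptions made for the example.

```python
import gzip
import os
import tempfile

# Hypothetical sample file with \u0001-separated key/value pairs.
path = os.path.join(tempfile.mkdtemp(), "sample.gz")
with gzip.open(path, "wt", encoding="utf-8") as out:
    out.write("name\u0001值\nengine\u0001søk\n")

engines = {}
# "rt" opens the file in text mode, so every line arrives already decoded.
with gzip.open(path, "rt", encoding="utf-8") as f:
    for line in f:
        key, value = line.rstrip("\n").split("\u0001")
        engines[key] = value

for en in engines:
    print(en)
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding.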
