Why do I get a UnicodeDecodeError when reading a file that contains Chinese characters?

>>> path = 'name.txt'
>>> content = None
>>> with open(path, 'r') as file:
...     content = file.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/mnt/lustre/share/miniconda3/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 163: ordinal not in range(128)

2 Answers

User

#1 · Edited 2024-04-26 05:33:20

By default, Python 2.7 reads files into byte strings.

By default, Python 3.x reads text-mode files into Unicode strings, so the bytes in the file must be decoded.
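To make the decoding step concrete, here is a minimal sketch that reproduces the same error without a file. The sample string is an assumption chosen for illustration (the contents of the asker's name.txt are not known); its UTF-8 encoding happens to start with the byte 0xe5, the same byte reported in the traceback.

```python
# Sample text is illustrative only; the real file contents are unknown.
data = "名字".encode("utf-8")  # the bytes as they would be stored on disk

# Decoding those bytes as ASCII fails, because every byte is >= 0x80.
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding with the encoding the bytes were written in succeeds.
text = data.decode("utf-8")
print(text)
```

In text mode, open() performs this decode step on the file's bytes, which is why the chosen codec matters.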

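The default encoding varies by operating system, but you can find out what it is by calling locale.getpreferredencoding(False). On Linux it is usually UTF-8, whereas Windows returns a localized ANSI encoding, for example cp1252 on US and Western European versions of Windows. The 'ascii' codec in the traceback above suggests that the environment has no UTF-8 locale configured (for example, the C/POSIX locale), although that is only an inference from the error message.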
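A quick way to check which encoding open() falls back to on a given machine is sketched below. The printed value depends on the OS and locale settings, so no particular output is assumed.

```python
import locale

# The encoding that text-mode open() uses when no encoding= is given.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```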

In Python 3, specify the encoding you expect the file to have so that you do not depend on a locale-specific default. For example:

with open(path,'r',encoding='utf8') as f:
    ...
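As a fuller sketch of the same idea, the example below writes a small UTF-8 file and reads it back with an explicit encoding, then shows that forcing ASCII reproduces the error. The temporary directory, file name, and sample names are assumptions made for illustration; they do not come from the question.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file standing in for the asker's name.txt.
    path = os.path.join(tmp_dir, "name.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("张三\n李四\n")

    # Reading with an explicit encoding does not depend on the locale.
    with open(path, "r", encoding="utf-8") as f:
        content = f.readlines()

    # Forcing ASCII fails on the first non-ASCII byte.
    try:
        with open(path, "r", encoding="ascii") as f:
            f.readlines()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

print(content)
print(ascii_failed)
```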

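You can do the same in Python 2, but use io.open() instead. It is compatible with Python 3's open() and reads Unicode strings rather than byte strings. io.open() is also available in Python 3, which helps with portability.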
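A minimal sketch of the portable form follows. It is written to avoid Python 3-only features, but the file name and sample text are hypothetical, and the snippet has only been reasoned through for Python 3.

```python
import io
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "name.txt")  # hypothetical sample file
try:
    # io.open() accepts encoding= on both Python 2 and Python 3.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(u"\u540d\u5b57\n")  # escaped form of a two-character sample

    with io.open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
finally:
    shutil.rmtree(tmp_dir)

# Unicode text on either version: str on Python 3, unicode on Python 2.
print(type(lines[0]))
```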

User

#2 · Edited 2024-04-26 05:33:20

open() is trying to read the file with the ASCII codec. The simplest way to fix this is to specify the encoding:

with open(path, 'r', encoding='utf-8') as file:
    content = file.readlines()

Your locale should probably specify UTF-8 as the preferred encoding, but I think that depends on the operating system and language settings.
