Alternative to u'string'

1 vote

2 answers

797 views

Asked 2025-04-16 03:20

I saved my script with UTF-8 encoding.

I changed the Windows code page to 65001.

I am using Python 2.6.

Script #1

# -*- coding: utf-8 -*-
print u'Español'
x = raw_input()

Script #2

# -*- coding: utf-8 -*-
a = 'Español'
a.encode('utf8')
print a
x = raw_input()

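Script #1 prints the word without any error, but script #2 fails with:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf1 in position 4: ordinal not in range(128)

I want to print the variable in script #2 dynamically without an error. I was told that calling encode('utf8') is equivalent to writing u'string'.

Clearly that is not correct, because it raises an error.

Does anyone know how to solve this?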


2 Answers

Regarding script #2:

a = 'Español'           # In Python 2 this is a byte string (str)
a = a.decode('utf-8')   # Decode the bytes into a unicode string
print(a)
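
As for why the original a.encode('utf8') raised a *decode* error: in Python 2, calling encode on a byte string first has to turn it into unicode, and that implicit step uses the default ASCII codec. Here is a minimal sketch of that hidden step made explicit. It assumes the bytes really are UTF-8, and the names raw, failed_at and text are just illustrative; the exact byte named in your traceback depends on how the file was actually saved.

```python
# -*- coding: utf-8 -*-
# Sketch only: `raw`, `failed_at` and `text` are illustrative names,
# not taken from the original scripts.
raw = b'Espa\xc3\xb1ol'   # UTF-8 bytes of 'Español'; a plain str in Python 2

# Python 2's str.encode('utf8') must first convert the bytes to unicode,
# and it does so with the ASCII codec. This is that step, written out:
try:
    raw.decode('ascii')
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start   # index of the first byte ASCII cannot handle

# Decoding with the codec the bytes were written in succeeds.
text = raw.decode('utf-8')
```

Note also that in script #2 the result of a.encode('utf8') was never assigned to anything, so even without the error the variable a would have stayed unchanged.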

Answered 2025-04-16 by Python大师


Change your code to the following:

# -*- coding: utf-8 -*-
a = 'Español'
a = a.decode('utf8')
print a
x = raw_input()

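decode() interprets the bytes of the string using the encoding you name and returns the corresponding unicode value. With the change above, your problem should be solved.

The underlying issue is that Python 2 stores a plain string (str) as a sequence of bytes, regardless of the file's encoding. What matters is how those bytes are interpreted, and that is exactly what decode() and u'' take care of.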
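
To make the bytes versus text distinction concrete, here is a small sketch, assuming the source bytes are UTF-8. The variable names are made up for illustration, and the bytes are spelled out as escapes so the example does not depend on how the file is saved.

```python
# -*- coding: utf-8 -*-
# Sketch only: all names below are illustrative.
raw = b'Espa\xc3\xb1ol'        # what a UTF-8 source file stores for 'Español'
text = raw.decode('utf-8')     # interpret those bytes as UTF-8 text

same_as_literal = (text == u'Espa\xf1ol')   # u'...' yields the same value
byte_count = len(raw)          # counts bytes: the n-tilde takes two in UTF-8
char_count = len(text)         # counts characters
round_trip = text.encode('utf-8')   # encode goes the other way: text to bytes
```

So decode() goes from bytes to unicode, encode() goes from unicode back to bytes, and u'...' simply gives you the unicode value directly.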

Answered 2025-04-16 by Python大师
