I have the following code:
# -*- coding: utf-8 -*-
print u"William Burges (1827–81) was an English architect and designer."
When I try to run it from the command line, I get the following message:
Traceback (most recent call last):
File "C:\Python27\utf8.py", line 3, in <module>
print u"William Burges (1827ŌĆō81) was an English architect and designer."
File "C:\Python27\lib\encodings\cp775.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2013' in position
20: character maps to <undefined>
How can I fix this so that Python handles this character? Why doesn't Python handle it with the existing code? I thought utf-8 worked for every character.
Thanks.
Edit:
This code prints the desired result:
# -*- coding: utf-8 -*-
print unicode("William Burges (1827-81) was an English architect and designer.", "utf-8").encode("cp866")
But when I try to print more than one sentence, for example:
# -*- coding: utf-8 -*-
print unicode("William Burges (1827–81) was an English architect and designer. I am here. ", "utf-8").encode("cp866")
I get the same error message:
Traceback (most recent call last):
File "C:\Python27\utf8vs.py", line 3, in <module>
print unicode("William Burges (1827ŌĆō81) was an English architect and desig
ner. I am here. ", "utf-8").encode("cp866")
File "C:\Python27\lib\encodings\cp866.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2013' in position
20: character maps to <undefined>
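I suspect the problem is with the print statement rather than with Python itself (it works fine on my Mac). To print the string, Python has to convert it to a format the console can display, and the long dash you used is not displayable in the default character set of the Windows command line.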
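The point above can be checked without printing anything. The following is a minimal sketch, not part of the original answer: `can_encode` is a hypothetical helper name, and the en dash is written as the escape `\u2013` so the example does not depend on the source file's encoding.

```python
from __future__ import print_function

text = u"William Burges (1827\u201381) was an English architect and designer."

def can_encode(s, codec):
    """Return True if every character of s can be encoded with codec."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# cp775 and cp866 are the console code pages named in the tracebacks above.
for codec in ("cp775", "cp866", "utf-8"):
    print(codec, can_encode(text, codec))
```

Both console code pages should report False while utf-8 reports True, which suggests the string itself is fine and only the conversion for the console fails.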
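The difference between your two sentences is not the length but the dash used in "(1827-81)" vs. "(1827–81)". Can you see the subtle difference? Try copying and pasting one over the other to check this.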
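Because the two dashes look almost identical on screen, it can help to ask Python what they are. This is a small sketch using the standard `unicodedata` module; the variable names `hyphen` and `en_dash` are chosen here for illustration.

```python
from __future__ import print_function
import unicodedata

hyphen = u"-"        # the character in the sentence that printed correctly
en_dash = u"\u2013"  # the character reported in the UnicodeEncodeError

for ch in (hyphen, en_dash):
    # Show the code point and the official Unicode name of each dash.
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```

The first is HYPHEN-MINUS (U+002D), which is plain ASCII; the second is EN DASH (U+2013), which is the `u'\u2013'` shown in the traceback.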
See also Python, Unicode, and the Windows console.
There is actually a wiki article about this problem on wiki.python.org, https://wiki.python.org/moin/PrintFails, which explains why this can happen with the charmap codec. It says: Setting the PYTHONIOENCODING environment variable as described above can be used to suppress the error messages. Setting to "utf-8" is not recommended as this produces an inaccurate, garbled representation of the output to the console. For best results, use your console's correct default codepage and a suitable error handler other than "strict".
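To illustrate what "a suitable error handler other than strict" does, here is a rough sketch that encodes by hand with cp866, the code page used in the question's edit. It is an illustration of the error handlers only, not the wiki's own code; the names `replaced` and `escaped` are made up for this example.

```python
from __future__ import print_function

text = u"William Burges (1827\u201381) was an English architect and designer."

# "replace" substitutes "?" for any character the code page lacks.
replaced = text.encode("cp866", "replace")

# "backslashreplace" keeps the information as an escape sequence instead.
escaped = text.encode("cp866", "backslashreplace")

# Decode again only so the results display the same way on any console.
print(replaced.decode("cp866"))
print(escaped.decode("cp866"))
```

The same handlers can be requested for all console output through the environment variable, which accepts the form `encodingname:errorhandler`, for example `set PYTHONIOENCODING=cp866:replace` on a console whose code page is cp866 (an assumption here; use whatever `chcp` reports).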
Your string contains an ndash symbol. It resembles the ASCII minus sign -, which is symbol 45 in an ASCII table. Replace the ndash with a minus sign, because ASCII cannot represent an ndash.
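One way to do the replacement in code rather than by retyping the sentence is sketched below. This is not the answer's original snippet; `ascii_text` is a name chosen for this illustration.

```python
from __future__ import print_function

text = u"William Burges (1827\u201381) was an English architect and designer. I am here. "

# Swap the en dash (U+2013) for the ASCII hyphen-minus (character 45).
ascii_text = text.replace(u"\u2013", u"-")

# Every remaining character is ASCII, so any console code page can show it.
print(ascii_text)
```

This changes the text that is displayed, so it only suits cases where a plain hyphen is an acceptable substitute.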