I have a Python file that contains this line of text:
The best 汉字 interests of the Aboriginal child in family law proceedings. Australian Journal of Family Law 12 140149.
I tried to process that line, but it keeps raising an error:
SyntaxError: Non-ASCII character '\xa3' in file
I put this at the very top of the file:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
The same error still occurs.
While keeping the UTF-8 header above, I also tried the following:
u'The best 汉字 interests of the Aboriginal child in family law proceedings. Australian Journal of Family Law 12 140149.'
Still the same error.
While keeping the UTF-8 header above, I also tried the following:
unicode('The best 汉字 interests of the Aboriginal child in family law proceedings. Australian Journal of Family Law 12 140149.')
Still the same error.
While keeping the UTF-8 header above, I also tried the following:
gettext.ugettext('The best 汉字 interests of the Aboriginal child in family law proceedings. Australian Journal of Family Law 12 140149.')
Still the same error.
What am I missing?
Here is the full code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
s = u'The best 汉字 interests of the Aboriginal child in family law proceedings. Australian Journal of Family Law 12 140149.'
def cleanup(s):
    control_chars = ''.join(map(unichr, range(0,9) + range(11,13) + range(14,32) + range(127,160)))
    cc_regex = re.compile('[%s]' % re.escape(control_chars))
    return cc_regex.sub(' ', s)
print cleanup(s)
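For readers on Python 3, here is a minimal sketch of the same cleanup. It assumes the intent is to replace every control character except tab, line feed, and carriage return with a space. `unichr` and list-returning `range` do not exist in Python 3, so the sketch uses `chr` and `itertools.chain`. The names `CONTROL_CHARS` and `CC_REGEX` are hypothetical and not part of the original code.

```python
import re
from itertools import chain

# Same code points as the listing above: 0-8, 11-12, 14-31 and 127-159.
# Tab (9), line feed (10) and carriage return (13) are deliberately kept.
CONTROL_CHARS = ''.join(
    map(chr, chain(range(0, 9), range(11, 13), range(14, 32), range(127, 160)))
)

# Compile the character class once instead of on every call.
CC_REGEX = re.compile('[%s]' % re.escape(CONTROL_CHARS))


def cleanup(text):
    """Replace each matched control character in text with a single space."""
    return CC_REGEX.sub(' ', text)
```

With this sketch, `cleanup('140\x01149')` returns `'140 149'`, while ordinary non-ASCII text such as `'汉字'` passes through unchanged.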
Traceback output:
C:\EBI\Work>cd c:\EBI\Work && cmd /C "set "PYTHONIOENCODING=UTF-8" && set "PYTHONUNBUFFERED=1" && C:/Python27/python.exe
Traceback (most recent call last):
  File "c:\EBI\Work\test2.py", line 7, in <module>
    s = unicode('The best µ▒ëσ¡ù interests of the Aboriginal child in family law proceedings. Australian Journal of Family Law 12 140☺149.')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 9: ordinal not in range(128)
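The `UnicodeDecodeError` in the traceback above is what happens when UTF-8 bytes are decoded with the ASCII codec, which is what Python 2's `unicode()` uses when no encoding is given. The sketch below reproduces that in Python 3 syntax, under the assumption that the file was saved as UTF-8. The variable names are hypothetical, and the use of code page 437 is a guess about the Windows console that produced the garbled text.

```python
# `raw` stands in for the bytes Python 2 would hold for a plain (non-u'') string literal.
raw = 'The best 汉字 interests'.encode('utf-8')

# Decoding with the ASCII codec fails at the first byte of the first non-ASCII character.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Naming the actual encoding succeeds and recovers the original text.
text = raw.decode('utf-8')

# Assumption: a console using code page 437 would display the same bytes
# as unrelated characters, which may explain the garbled traceback line.
garbled = raw.decode('cp437')
```

In Python 2.7 the equivalent of the successful call would be `unicode(byte_string, 'utf-8')` or `byte_string.decode('utf-8')`.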