UnicodeDecodeError when getting a cookie in Google App Engine
I have a GAE project written in Python. In one of my request handlers I set a cookie with the following code:
self.response.headers['Set-Cookie'] = 'app=ABCD; expires=Fri, 31-Dec-2020 23:59:59 GMT'
I checked in Chrome and the cookie is there, so setting it appears to work.
Then, in another request handler, I try to read the cookie back to check it:
appCookie = self.request.cookies['app']
Executing this line raises the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1962: ordinal not in range(128)
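It looks like it is trying to decode the incoming cookie data with the ASCII codec rather than UTF-8.
How do I force Python to use UTF-8 to decode it?
As a newcomer to Python and Google App Engine (but an experienced programmer in other languages), are there any other Unicode-related gotchas I should know about?
Here is the full traceback: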
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 4144, in _HandleRequest
self._Dispatch(dispatcher, self.rfile, outfile, env_dict)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 4049, in _Dispatch
base_env_dict=env_dict)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 616, in Dispatch
base_env_dict=base_env_dict)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 3120, in Dispatch
self._module_dict)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 3024, in ExecuteCGI
reset_modules = exec_script(handler_path, cgi_path, hook)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 2887, in ExecuteOrImportScript
exec module_code in script_module.__dict__
File "/Users/ken/hgdev/juicekit/main.py", line 402, in <module>
main()
File "/Users/ken/hgdev/juicekit/main.py", line 399, in main
run_wsgi_app(application)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/util.py", line 98, in run_wsgi_app
run_bare_wsgi_app(add_wsgi_middleware(application))
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/util.py", line 116, in run_bare_wsgi_app
result = application(env, _start_response)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 721, in __call__
response.wsgi_write(start_response)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 296, in wsgi_write
body = self.out.getvalue()
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/StringIO.py", line 270, in getvalue
self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1962: ordinal not in range(128)
2 Answers
0
First, encode any unicode value you set in a cookie. You also need to quote it so that the value cannot break the header:
import urllib
from google.appengine.ext import webapp
# This is the value we want to set.
initial_value = u'äëïöü'
# WebOb version that comes with SDK doesn't quote cookie values
# in the Response, neither webapp.Response. So we have to do it.
quoted_value = urllib.quote(initial_value.encode('utf-8'))
rsp = webapp.Response()
rsp.headers['Set-Cookie'] = 'app=%s; Path=/' % quoted_value
Next, let's read the value back. To test it, create a fake Request carrying the cookie we just set. This code was taken from a real unit test:
cookie = rsp.headers.get('Set-Cookie')
req = webapp.Request.blank('/', headers=[('Cookie', cookie)])
# The stored value is the same quoted value from before.
# Notice that here we use .str_cookies, not .cookies.
stored_value = req.str_cookies.get('app')
self.assertEqual(stored_value, quoted_value)
The value is still encoded and quoted. We need to reverse both steps to get the original value back:
# And we can get the initial value unquoting and decoding.
final_value = urllib.unquote(stored_value).decode('utf-8')
self.assertEqual(final_value, initial_value)
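The encode/quote and unquote/decode steps above can be wrapped into a pair of helpers. This is only a minimal sketch: the names encode_cookie_value and decode_cookie_value are made up for illustration and are not part of webapp or WebOb, and the import fallback is an assumption added so the same code also runs on Python 3, where quote and unquote live in urllib.parse.

```python
try:
    from urllib import quote, unquote        # Python 2
except ImportError:
    from urllib.parse import quote, unquote  # Python 3


def encode_cookie_value(value):
    """Hypothetical helper: unicode text -> ASCII-only, percent-quoted string."""
    return quote(value.encode('utf-8'))


def decode_cookie_value(stored):
    """Hypothetical helper: reverse encode_cookie_value()."""
    raw = unquote(stored)
    # On Python 2, unquote() returns a byte string that still has to be
    # decoded. On Python 3 it already returns text decoded as UTF-8.
    if isinstance(raw, bytes):
        raw = raw.decode('utf-8')
    return raw


# Example value written with escapes so the source file stays pure ASCII.
initial_value = u'\xe4\xeb\xef\xf6\xfc'
quoted_value = encode_cookie_value(initial_value)
header_value = 'app=%s; Path=/' % quoted_value
final_value = decode_cookie_value(quoted_value)
```

Because the quoted value contains only ASCII characters, it can be placed in the Set-Cookie header without triggering an implicit ASCII decode later on.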
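If you can, consider using webapp2. webob.Response takes care of all the tedious work of quoting and setting cookies for you, and you can set unicode values directly. A summary of these issues can be found here.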
0
You probably want to use the decode method, something like this (thanks to @agf):
self.request.cookies['app'].decode('utf-8')
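To see why the choice of codec matters, here is a small sketch that does not touch App Engine at all. The byte string is a made-up example, chosen because 0xe2 (the byte named in the error message) is the first byte of the three-byte UTF-8 encoding of characters such as the right single quotation mark U+2019.

```python
# UTF-8 bytes for "it's" written with a curly apostrophe (U+2019).
raw = b'it\xe2\x80\x99s'

# The ASCII codec only accepts bytes below 0x80, so it fails on 0xe2.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding the same bytes as UTF-8 yields the intended four characters.
text = raw.decode('utf-8')
```

The exception's start attribute holds the offset of the offending byte, which is the same number reported as "position" in the error message.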
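From the official Python documentation (with some additional details):
Python's 8-bit strings have a .decode([encoding], [errors]) method that interprets the string using the given encoding. The following example shows a string being converted to Unicode and then back to an 8-bit string: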
>>> u = unichr(40960) + u'abcd' + unichr(1972) # Assemble a string
>>> type(u), u # Examine
(<type 'unicode'>, u'\ua000abcd\u07b4')
>>> utf8_version = u.encode('utf-8') # Encode as UTF-8
>>> type(utf8_version), utf8_version # Examine
(<type 'str'>, '\xea\x80\x80abcd\xde\xb4')
>>> u2 = utf8_version.decode('utf-8') # Decode using UTF-8
>>> u == u2 # The two strings match
True