Is this code useful?

Question

import urllib

def _oauth_escape(val):
    if isinstance(val, unicode):  # useful?
        val = val.encode("utf-8")  # useful?
    return urllib.quote(val, safe="~")

I don't think the two marked lines do anything useful. Am I right?

Update

I thought unicode was simply "utf-8". Is that right?

Answer 1

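In Python 3.0 all strings support Unicode, but in earlier versions you have to convert strings to Unicode strings explicitly. Could that be the reason?

(utf-8 is not the only encoding, but it is the most commonly used Unicode encoding. See this article.)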

Answer 2

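As others have said, unicode and utf-8 are not the same thing. utf-8 is one of several encodings for unicode.

Think of a unicode object as an "unencoded" unicode string, whereas a string object is encoded in one particular encoding (unfortunately, string objects have no attribute that tells you which encoding that is).

val.encode("utf-8") converts a unicode object into a utf-8 encoded string object.

In Python 2.6 this is necessary, because urllib cannot handle unicode properly.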
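To illustrate the point about the missing encoding attribute, here is a minimal sketch. It assumes Python 3, where bytes plays the role of the Python 2 string object described above; the variable names are only for illustration.

```python
# Python 3 sketch: bytes stands in for the Python 2 str described above.
text = "ä"
encoded = text.encode("utf-8")  # two bytes: 0xC3 0xA4

# The bytes object keeps no record of the encoding that produced it,
# so decoding with the wrong codec succeeds silently and yields wrong text.
right = encoded.decode("utf-8")
wrong = encoded.decode("latin-1")

print(repr(right))
print(repr(wrong))
```

The second decode does not raise an error. It just produces two unrelated characters, which is why you have to keep track of the encoding yourself.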

>>> import urllib
>>> urllib.quote(u"")
''
>>> urllib.quote(u"ä")
/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py:1216: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  res = map(safe_map.__getitem__, s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 1216, in quote
    res = map(safe_map.__getitem__, s)
KeyError: u'\xe4'
>>> urllib.quote(u"ä".encode("utf-8"))
'%C3%A4'

In Python 3.x, however, all strings are unicode (the Python 3 counterpart of an encoded string is the bytes object), so this step is no longer needed.

>>> import urllib.parse
>>> urllib.parse.quote("ä")
'%C3%A4'
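For completeness, a possible Python 3 port of the function from the question might look like the sketch below. The name oauth_escape is hypothetical, and the explicit encode is kept only to make the step visible: urllib.parse.quote already encodes str input as UTF-8 by default.

```python
from urllib.parse import quote


def oauth_escape(val):
    """Percent-encode val, leaving only unreserved characters unescaped."""
    # Redundant in Python 3 (quote() defaults to UTF-8 for str input),
    # kept here to mirror the Python 2 code in the question.
    if isinstance(val, str):
        val = val.encode("utf-8")
    # safe="~" replaces the default safe="/", so "/" is escaped as well.
    return quote(val, safe="~")


print(oauth_escape("ä"))
print(oauth_escape("a b/c~d"))
```

Passing already-encoded bytes also works, since quote accepts both str and bytes.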

Answer 3

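UTF-8 is an encoding: put simply, a way of turning Unicode data into a sequence of bytes. It is one of many such encodings. A str object in Python 2 is a byte string, which can hold arbitrary binary data, for example text represented in a particular encoding.

Python's unicode type is an abstract, unencoded representation of text. A unicode string can be encoded in many different encodings.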
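A small sketch of that last point, assuming Python 3 (where str is the Unicode type): the same one-character text gives different byte sequences depending on the encoding chosen. The three codecs are picked only as examples.

```python
text = "ä"  # one Unicode code point, U+00E4

utf8 = text.encode("utf-8")       # b'\xc3\xa4'  (2 bytes)
latin1 = text.encode("latin-1")   # b'\xe4'      (1 byte)
utf16 = text.encode("utf-16-le")  # b'\xe4\x00'  (2 bytes, little-endian)

# Each byte sequence decodes back to the same text,
# provided the matching codec is used.
assert utf8.decode("utf-8") == text
assert latin1.decode("latin-1") == text
assert utf16.decode("utf-16-le") == text
```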
