用Python解析HTTP请求的Authorization头部
我需要处理一个像这样的头部信息:
Authorization: Digest qop="chap",
realm="testrealm@host.com",
username="Foobear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41"
然后用Python把它解析成这样:
{'protocol':'Digest',
'qop':'chap',
'realm':'testrealm@host.com',
'username':'Foobear',
'response':'6629fae49393a05397450978507c4ef1',
'cnonce':'5ccc069c403ebaf9f0171e9517f40e41'}
有没有什么库可以做到这一点,或者有什么东西可以给我一些灵感?
我在Google App Engine上做这个,不太确定Pyparsing库是否可用,但如果它是最好的解决方案,我也许可以把它包含在我的应用里。
目前我正在创建自己的MyHeaderParser对象,并用reduce()函数处理头部字符串。这个方法可以用,但很脆弱。
下面是nadia的精彩解决方案:
import re
reg = re.compile('(\w+)[=] ?"?(\w+)"?')
s = """Digest
realm="stackoverflow.com", username="kixx"
"""
print str(dict(reg.findall(s)))
10 个回答
3
这是我用pyparsing写的代码:
text = """Authorization: Digest qop="chap",
realm="testrealm@host.com",
username="Foobear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41" """
from pyparsing import *
AUTH = Keyword("Authorization")
ident = Word(alphas,alphanums)
EQ = Suppress("=")
quotedString.setParseAction(removeQuotes)
valueDict = Dict(delimitedList(Group(ident + EQ + quotedString)))
authentry = AUTH + ":" + ident("protocol") + valueDict
print authentry.parseString(text).dump()
运行后输出:
['Authorization', ':', 'Digest', ['qop', 'chap'], ['realm', 'testrealm@host.com'],
['username', 'Foobear'], ['response', '6629fae49393a05397450978507c4ef1'],
['cnonce', '5ccc069c403ebaf9f0171e9517f40e41']]
- cnonce: 5ccc069c403ebaf9f0171e9517f40e41
- protocol: Digest
- qop: chap
- realm: testrealm@host.com
- response: 6629fae49393a05397450978507c4ef1
- username: Foobear
我对RFC不太了解,但希望这能帮助你入门。
10
你也可以像[CheryPy][1]那样使用urllib2。
下面是代码片段:
input= """
Authorization: Digest qop="chap",
realm="testrealm@host.com",
username="Foobear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""
import urllib2
field, sep, value = input.partition("Authorization: Digest ")
if value:
items = urllib2.parse_http_list(value)
opts = urllib2.parse_keqv_list(items)
opts['protocol'] = 'Digest'
print opts
它的输出结果是:
{'username': 'Foobear', 'protocol': 'Digest', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'realm': 'testrealm@host.com', 'response': '6629fae49393a05397450978507c4ef1'}
[1]: https://web.archive.org/web/20130118133623/http://www.google.com:80/codesearch/p?hl=en#OQvO9n2mc04/CherryPy-3.0.1/cherrypy/lib/httpauth.py&q=Authorization 摘要 http 语言:python
14
这里有一点正则表达式的内容:
import re
reg=re.compile('(\w+)[:=] ?"?(\w+)"?')
>>>dict(reg.findall(headers))
{'username': 'Foobear', 'realm': 'testrealm', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'response': '6629fae49393a05397450978507c4ef1', 'Authorization': 'Digest'}