Python urllib2 Basic Auth Problem

83 votes
5 answers
123678 views
Asked 2025-04-15 20:11

Update: following Lee's comment, I reduced my code to a very simple script and ran it from the command line:

import urllib2
import sys

username = sys.argv[1]
password = sys.argv[2]
url = sys.argv[3]
print("calling %s with %s:%s\n" % (url, username, password))

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, username, password)
urllib2.install_opener(urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman)))

req = urllib2.Request(url)
f = urllib2.urlopen(req)
data = f.read()
print(data)

Unfortunately it still doesn't generate the Authorization header (according to Wireshark) :(

I'm having a problem sending basic auth with urllib2. I looked at this article and followed the example. My code:

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, "api.foursquare.com", username, password)
urllib2.install_opener(urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman)))

req = urllib2.Request("http://api.foursquare.com/v1/user")    
f = urllib2.urlopen(req)
data = f.read()

With Wireshark I see the following on the wire:

GET /v1/user HTTP/1.1
Host: api.foursquare.com
Connection: close
Accept-Encoding: gzip
User-Agent: Python-urllib/2.5 

You can see that no Authorization header is sent, whereas it is sent when I make the request with curl: curl -u user:password http://api.foursquare.com/v1/user

GET /v1/user HTTP/1.1
Authorization: Basic =SNIP=
User-Agent: curl/7.19.4 (universal-apple-darwin10.0) libcurl/7.19.4 OpenSSL/0.9.8k zlib/1.2.3
Host: api.foursquare.com
Accept: */*

For some reason my code doesn't seem to send the credentials; can anyone see what I'm missing?

Thanks

-simon

5 Answers

5 votes

This is what I used to solve a similar problem when accessing MailChimp's API. It does the same thing, just formatted more nicely.

import urllib2
import base64

chimpConfig = {
    "headers": {
        "Content-Type": "application/json",
        # b64encode adds no newlines, so no cleanup is needed afterwards
        "Authorization": "Basic " + base64.b64encode("hayden:MYSECRETAPIKEY"),
    },
    "url": "https://us12.api.mailchimp.com/3.0/",
}

# Send the request with the Authorization header already attached
datas = None  # no request body, so this is a GET
request = urllib2.Request(chimpConfig["url"], datas, chimpConfig["headers"])
result = urllib2.urlopen(request)

5 votes

(Copied and adapted from https://stackoverflow.com/a/24048772/1733117.)

First, you can subclass urllib2.BaseHandler or urllib2.HTTPBasicAuthHandler and implement http_request, so that every outgoing request automatically gets the appropriate Authorization header.

import urllib2
import base64

class PreemptiveBasicAuthHandler(urllib2.HTTPBasicAuthHandler):
    '''Preemptive basic auth.

    Instead of waiting for a 401 to then retry with the credentials,
    send the credentials if the url is handled by the password manager.
    Note: please use realm=None when calling add_password.'''
    def http_request(self, req):
        url = req.get_full_url()
        realm = None
        # this is very similar to the code from retry_http_basic_auth()
        # but returns a request object.
        user, pw = self.passwd.find_user_password(realm, url)
        if pw:
            raw = "%s:%s" % (user, pw)
            auth = 'Basic %s' % base64.b64encode(raw).strip()
            req.add_unredirected_header(self.auth_header, auth)
        return req

    https_request = http_request

Then, if you are lazy like me, install the handler globally so you don't have to set it up for every request.

api_url = "http://api.foursquare.com/"
api_username = "johndoe"
api_password = "some-cryptic-value"

auth_handler = PreemptiveBasicAuthHandler()
auth_handler.add_password(
    realm=None, # default realm.
    uri=api_url,
    user=api_username,
    passwd=api_password)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
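
For readers on Python 3, where urllib2 became urllib.request, the standard library already supports preemptive basic auth through HTTPPasswordMgrWithPriorAuth, so a custom handler may not be needed there. The following is a minimal sketch, assuming Python 3.5 or later. The host api.example.com, the credentials, and the helper name build_preemptive_handler are placeholders, and no request is actually sent:

```python
import urllib.request

# Placeholder values for illustration only (not from the original post).
API_URL = "http://api.example.com/"
API_USERNAME = "johndoe"
API_PASSWORD = "some-cryptic-value"


def build_preemptive_handler(url, username, password):
    """Return a basic-auth handler that sends credentials on the first request."""
    passman = urllib.request.HTTPPasswordMgrWithPriorAuth()
    # is_authenticated=True tells the handler not to wait for a 401 challenge.
    passman.add_password(None, url, username, password, is_authenticated=True)
    return urllib.request.HTTPBasicAuthHandler(passman)


handler = build_preemptive_handler(API_URL, API_USERNAME, API_PASSWORD)

# Run only the handler's pre-processing step to inspect the header it would add.
request = handler.http_request(urllib.request.Request(API_URL + "v1/user"))
print(request.get_header("Authorization"))
```

To use it for real requests, pass the handler to urllib.request.build_opener() and call open() on the resulting opener.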

202 votes

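The problem may be that the Python libraries, following the HTTP standard, first send an unauthenticated request and only retry with the credentials after the server answers with a 401. If the Foursquare servers don't do "totally standard authentication", that is, if they never reply with a 401 challenge, the libraries never get to the point of sending the credentials.

Try sending the credentials in a request header instead: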
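
The two-step exchange can be observed without Wireshark by pointing the handler at a throwaway local server. This is only a sketch, written for Python 3 (urllib.request rather than urllib2). The server, the "demo" realm, the credentials, and the function name run_demo are all made up for illustration, and the server accepts any Authorization header without checking it:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_demo(username="user", password="secret"):
    """Fetch a page from a throwaway local server and return the
    Authorization header value the server saw on each request."""
    seen = []

    class ChallengeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            auth = self.headers.get("Authorization")
            seen.append(auth)
            if auth is None:
                # No credentials yet: answer with the standard challenge.
                self.send_response(401)
                self.send_header("WWW-Authenticate", 'Basic realm="demo"')
                self.send_header("Content-Length", "0")
                self.end_headers()
            else:
                # Any credentials are accepted; this demo does not verify them.
                self.send_response(200)
                self.send_header("Content-Length", "2")
                self.end_headers()
                self.wfile.write(b"ok")

        def log_message(self, *args):
            pass  # keep the demo output quiet

    server = HTTPServer(("127.0.0.1", 0), ChallengeHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        passman.add_password(None, url, username, password)
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # ignore any proxy settings
            urllib.request.HTTPBasicAuthHandler(passman),
        )
        opener.open(url).read()
    finally:
        server.shutdown()
        server.server_close()
    return seen


if __name__ == "__main__":
    for number, header in enumerate(run_demo(), start=1):
        print("request %d: Authorization = %r" % (number, header))
```

The first request arrives without an Authorization header and the second one carries it. A server that never sends the 401 challenge would therefore never receive the credentials from this handler.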

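import urllib2
import base64

# username and password are the credentials from the question
request = urllib2.Request("http://api.foursquare.com/v1/user")
base64string = base64.b64encode('%s:%s' % (username, password))
# Attach the header up front instead of waiting for a 401 challenge
request.add_header("Authorization", "Basic %s" % base64string)
result = urllib2.urlopen(request)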
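
On Python 3 the same idea needs one extra step, because base64.b64encode works on bytes rather than str. A minimal sketch, assuming Python 3 and urllib.request; the helper name basic_auth_header, the host api.example.com, and the credentials are placeholders, and the request is built but never sent:

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    # Python 3's b64encode works on bytes, so encode first and decode after.
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Placeholder URL and credentials; the request is built but never sent.
request = urllib.request.Request("http://api.example.com/v1/user")
request.add_header("Authorization", basic_auth_header("user", "secret"))
print(request.get_header("Authorization"))  # Basic dXNlcjpzZWNyZXQ=
```

Passing the request to urllib.request.urlopen() would then send the header on the very first attempt.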

I had the same problem as you and found the solution in this thread: http://forums.shopify.com/categories/9/posts/27662
