Authenticated HTTP POST with an XML payload using Python urllib2

6 votes
1 answer
3631 views
Asked 2025-04-16 00:43

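I'm trying to send a POST request with a plain XML body (at least I think that's what it is) using urllib2 under IronPython. However, every time I send it, I get back error code 400 (Bad Request).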

What I'm actually trying to do is mimic a Boxee call that removes an item from the queue. The actual packet looks like this (captured with Wireshark):

POST /action/add HTTP/1.1
User-Agent: curl/7.16.3 (Windows  build 7600; en-US; beta) boxee/0.9.21.11487
Host: app.boxee.tv
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: boxee_ping_version=9; X-Mapping-oompknoc=76D730BC9E858725098BF13AEFE32EB5; boxee_app=e01e36e85d368d4112fe4d1b6587b1fd
Connection: keep-alive
Content-Type: text/xml
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Language: en-us,en;q=0.5
Keep-Alive: 300
Connection: keep-alive
Content-Length: 53

<message type="dequeue" referral="3102296"></message>

I'm using the following Python code to send the POST:

import base64
import re
import sys
import urllib2

def PostProtectedPage(theurl, username, password, postdata):

    req = urllib2.Request(theurl, data=postdata)
    req.add_header('Content-Type', 'text/xml')
    try:
        handle = urllib2.urlopen(req)
    except IOError, e:                  # here we are assuming we fail
        pass
    else:                               # If we don't fail then the page isn't protected
        print "This page isn't protected by authentication."
        sys.exit(1)

    if not hasattr(e, 'code') or e.code != 401:                 # we got an error - but not a 401 error
        print 'The request failed, but not with a 401 (authentication required).'
        print e
        sys.exit(1)

    authline = e.headers.get('www-authenticate', '')                # this gets the www-authenticate line from the headers - which has the authentication scheme and realm in it
    if not authline:
        print 'A 401 error without an authentication response header - very weird.'
        sys.exit(1)

    authobj = re.compile(r'''(?:\s*www-authenticate\s*:)?\s*(\w*)\s+realm=['"]([^'"]+)['"]''', re.IGNORECASE)          # this regular expression is used to extract scheme and realm (the realm may contain spaces or punctuation)
    matchobj = authobj.match(authline)
    if not matchobj:                                        # if the authline isn't matched by the regular expression then something is wrong
        print 'The authentication line is badly formed.'
        sys.exit(1)
    scheme = matchobj.group(1) 
    realm = matchobj.group(2)
    if scheme.lower() != 'basic':
        print 'This example only works with BASIC authentication.'
        sys.exit(1)

    base64string = base64.b64encode('%s:%s' % (username, password))   # b64encode never inserts line breaks, unlike encodestring
    authheader =  "Basic %s" % base64string
    req.add_header("Authorization", authheader)
    try:
        handle = urllib2.urlopen(req)
    except IOError, e:                  # here we shouldn't fail if the username/password is right
        print "It looks like the username or password is wrong."
        print e
        sys.exit(1)
    thepage = handle.read()
    return thepage

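However, whenever I run this code it returns error 400 (Bad Request).
I know the authentication is correct, because I use it elsewhere to fetch the queue (and I can't imagine it not being used here; otherwise how would the server know which account to change?).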

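Looking at the network capture, am I just forgetting to add some headers to the request? It's probably something simple, but I don't know enough about Python or HTTP requests to work out what.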

Edit: by the way, this is how I call the code (it's actually dynamic, but this is the basic idea):

PostProtectedPage("http://app.boxee.tv/action/add", "user", "pass", "<message type=\"dequeue\" referral=\"3102296\"></message>")

1 Answer

0

This works fine for me:

curl -v -A 'curl/7.16.3 (Windows  build 7600; en-US; beta) boxee/0.9.21.11487' \
 -H 'Content-Type: text/xml' -u "USER:PASS" \
 --data '<message type="dequeue" referral="12573293"></message>' \
 'http://app.boxee.tv/action/add'
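
For reference, here is a rough Python counterpart to the curl call above. It is only a sketch: the function name build_dequeue_request is made up for illustration, it assumes ASCII credentials, and it sends the Basic Authorization header up front (as curl -u does) instead of waiting for a 401 challenge. It only builds the request; nothing is sent until you pass it to urllib2.urlopen.

```python
import base64

try:                                   # Python 2 / IronPython
    import urllib2
except ImportError:                    # Python 3: Request lives in urllib.request
    import urllib.request as urllib2


def build_dequeue_request(url, username, password, referral):
    """Build (but do not send) a POST that mirrors the curl command.

    Hypothetical helper, not part of urllib2 or of any Boxee API.
    """
    body = '<message type="dequeue" referral="%s"></message>' % referral
    req = urllib2.Request(url, data=body.encode('utf-8'))
    req.add_header('Content-Type', 'text/xml')
    # Same User-Agent string as in the capture and the curl call.
    req.add_header('User-Agent',
                   'curl/7.16.3 (Windows  build 7600; en-US; beta) '
                   'boxee/0.9.21.11487')
    # Preemptive Basic auth: base64 of "user:password" (assumed ASCII).
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    req.add_header('Authorization', 'Basic %s' % token)
    return req
```

Something like urllib2.urlopen(build_dequeue_request('http://app.boxee.tv/action/add', 'USER', 'PASS', '12573293')).read() would then perform the call.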

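However, if I try to remove a referral ID that is not currently in the queue, I get a 400 Bad Request error. If you are using the same referral ID you captured with Wireshark, you are very likely hitting the same problem. Use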

wget -nv -m -nd --user=USER --password=PASS http://app.boxee.tv/api/get_queue

to confirm that the item you want to remove is actually in the queue.
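
To tell a 400 apart from an authentication failure on the Python side, it can help to look at the status code and response body instead of treating every IOError the same way. A minimal sketch, assuming Python 2.6+ syntax (IronPython 2.6 or later); post_and_report and its opener parameter are hypothetical names used only for this example:

```python
try:                                   # Python 2 / IronPython
    import urllib2
    HTTPError = urllib2.HTTPError
except ImportError:                    # Python 3
    import urllib.request as urllib2
    from urllib.error import HTTPError


def post_and_report(req, opener=urllib2.urlopen):
    """Send req and return (status_code, body), even for HTTP error replies.

    opener defaults to urllib2.urlopen; it is a parameter only so that
    the function can be exercised without touching the network.
    """
    try:
        handle = opener(req)
    except HTTPError as e:
        # HTTPError is itself a response object: it carries the status
        # code and whatever body the server sent with the error.
        return e.code, e.read()
    return handle.getcode(), handle.read()
```

A 401 here would point at the credentials, while a 400 with valid credentials is consistent with a referral ID that is no longer in the queue.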
