Python HTTPConnection request encoding error

Asked 2025-04-17 03:58

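I am trying to upload a zip file to a website with a Python script. The site provides an API for exactly this purpose. However, when I use it, an encoding error is raised at the point where all the strings to be sent are joined together. I traced the offending string to filedata (my zip file).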

Traceback (most recent call last):
  File "/Library/Application Junk/ProjectManager/Main.py", line 146, in OnUpload
    CurseUploader.upload_file('77353ba57bdeb5346d1b3830ed36171279763e35', 'wow', slug, version, VersionID, 'r', logText or '', 'creole', '', 'plain', zipPath)
  File "/Library/Application Junk/ProjectManager/CurseUploader.py", line 83, in upload_file
    content_type, body = encode_multipart_formdata(params, [('file', filepath)])
  File "/Library/Application Junk/ProjectManager/CurseUploader.py", line 153, in encode_multipart_formdata
    body = '\r\n'.join(L)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb in position 10: ordinal not in range(128)

Edit: Here is the full code, as requested.

Edit 2: As suggested, I tried encoding all non-ASCII strings to ASCII. The same error occurred, but this time on the line L[i] = value.encode("ascii").

from httplib import HTTPConnection
from os.path import basename, exists
from mimetools import choose_boundary

try:
    import simplejson as json
except ImportError:
    import json

def get_game_versions(game):
    """
    Return the JSON response as given from /game-versions.json from curseforge.com of the given game

    `game`
        The shortened version of the game, e.g. "wow", "war", or "rom"
    """
    conn = HTTPConnection('%(game)s.curseforge.com' % { 'game': game })
    conn.request("GET", '/game-versions.json')
    response = conn.getresponse()
    assert response.status == 200, "%(status)d %(reason)s from /game-versions.json" % { 'status': response.status, 'reason': response.reason }

    assert response.content_type == 'application/json'
    data = json.loads(response.read())

    return data

def upload_file(api_key, game, project_slug, name, game_version_ids, file_type, change_log, change_markup_type, known_caveats, caveats_markup_type, filepath):
    """
    Upload a file to CurseForge.com on your project

    `api_key`
        The api-key from http://www.curseforge.com/home/api-key/

    `game`
        The shortened version of the game, e.g. "wow", "war", or "rom"

    `project_slug`
        The slug of your project, e.g. "my-project"

    `name`
        The name of the file you're uploading, this should be the version's name, do not include your project's name.

    `game_version_ids`
        A set of game version ids.

    `file_type`
        Specify 'a' for Alpha, 'b' for Beta, and 'r' for Release.

    `change_log`
        The change log of the file. Up to 50k characters is acceptable.

    `change_markup_type`
        Markup type for your change log. creole or plain is recommended.

    `known_caveats`
        The known caveats of the file. Up to 50k characters is acceptable.

    `caveats_markup_type`
        Markup type for your known caveats. creole or plain is recommended.

    `filepath`
        The path to the file to upload.
    """

    assert len(api_key) == 40
    assert 1 <= len(game_version_ids) <= 3
    assert file_type in ('r', 'b', 'a')
    assert exists(filepath)

    params = []

    params.append(('name', name))

    for game_version_id in game_version_ids:
        params.append(('game_version', game_version_id))

    params.append(('file_type', file_type))
    params.append(('change_log', change_log))
    params.append(('change_markup_type', change_markup_type))
    params.append(('known_caveats', known_caveats))
    params.append(('caveats_markup_type', caveats_markup_type))

    content_type, body = encode_multipart_formdata(params, [('file', filepath)])
    print('Got here?')

    headers = {
        "User-Agent": "CurseForge Uploader Script/1.0",
        "Content-type": content_type,
        "X-API-Key": api_key}

    conn = HTTPConnection('%(game)s.curseforge.com' % { 'game': game })
    conn.request("POST", '/projects/%(slug)s/upload-file.json' % {'slug': project_slug}, body, headers)
    response = conn.getresponse()
    if response.status == 201:
        print "Successfully uploaded %(name)s" % { 'name': name }
    elif response.status == 422:
        assert response.content_type == 'application/json'
        errors = json.loads(response.read())
        print "Form error with uploading %(name)s:" % { 'name': name }
        for k, items in errors.iteritems():
            for item in items:
                print "    %(k)s: %(item)s" % { 'k': k, 'item': item }
    else:
        print "Error with uploading %(name)s: %(status)d %(reason)s" % { 'name': name, 'status': response.status, 'reason': response.reason }

def is_ascii(s):
    return all(ord(c) < 128 for c in s)

def encode_multipart_formdata(fields, files):
    """
    Encode data in multipart/form-data format.

    `fields`
        A sequence of (name, value) elements for regular form fields.

    `files`
        A sequence of (name, filename) elements for data to be uploaded as files

    Return (content_type, body) ready for httplib.HTTP instance
    """
    boundary = choose_boundary()
    L = []

    for key, value in fields:
        if value is None:
            value = ''
        elif value is False:
            continue

        L.append('--%(boundary)s' % {'boundary': boundary})
        L.append('Content-Disposition: form-data; name="%(name)s"' % {'name': key})
        L.append('')
        L.append(value)

    for key, filename in files:
        f = file(filename, 'rb')
        filedata = f.read()
        f.close()
        L.append('--%(boundary)s' % {'boundary': boundary})
        L.append('Content-Disposition: form-data; name="%(name)s"; filename="%(filename)s"' % { 'name': key, 'filename': basename(filename) })
        L.append('Content-Type: application/zip')
        L.append('')
        L.append(filedata)

    L.append('--%(boundary)s--' % {'boundary': boundary})
    L.append('')

    for i in range(len(L)):
        value = L[i]
        if not is_ascii(value):
            L[i] = value.encode("ascii")

    body = '\r\n'.join(L)
    content_type = 'multipart/form-data; boundary=%(boundary)s' % { 'boundary': boundary }
    return content_type, body

How can I solve this?


Edit 3: As requested, here is the full output of printing the variables.

fields: [('name', u'2.0.3'), ('game_version', u'1'), ('game_version', u'4'), ('game_version', u'9'), ('file_type', 'r'), ('change_log', u'====== 2.0.3\n* Jaliborc: Fixed a bug causing wrong items to be shown for leather, mail and plate slots\n* Jaliborc: Items are now organized by level as well\n\n====== 2.0.2\n* Jaliborc: Completly rewritten the categories dropdown to fix a bug\n\n====== 2.0.1\n* Jaliborc: Updated for patch 4.2\n* Jaliborc: Included all Firelands items\n\n===== 2.0.0\n* Jaliborc: Now works with 4.1\n* Jaliborc: Completely redesigned and improved\n* Jaliborc: Includes **all** items in-game right from the start\n* Jaliborc: Searches trough thousands of items in a blaze\n* Jaliborc: Mostly //Load on Demand//\n* Jaliborc: Only works on English clients. Versions for other clients should be released in a close future.\n\n====== 1.8.7\n* Added linkerator support for multiple chat frames\n\n====== 1.8.6\n* Fixed a bug when linking an item from the chat frame. \n\n====== 1.8.5\n* Added compatibility with WoW 3.3.5\n\n====== 1.8.3\n* Bumped TOC for 3.3\n\n====== 1.8.2\n* Bumped TOC for 3.2\n\n====== 1.8.1\n* TOC Bump + Potential WIM bugfix\n\n===== 1.8.0\n* Added "Heirloom" option to quality selector\n* Fixed a bug causing the DB to be reloaded on item scroll\n* Cleaned up the code a bit.  Still need to work on the GUI/localization\n* Altered slash commands.  See addon description for details.\n\n====== 1.7.2\n* Bumped the max item ID to check from 40k to 60k.  Glyphs, etc, should now appear.\n\n====== 1.7.1\n* Fixed a crash issue when linking tradeskills\n\n===== 1.7.0\n* Made Wrath compatible\n* Seems to be causing a lot more CPU usage now, will investigate later.'), ('change_markup_type', 'creole'), ('known_caveats', ''), ('caveats_markup_type', 'plain')]

files: [('file', u'/Users/Jaliborc/Desktop/Ludwig 2.0.3.zip')]

It looks like some of the values are Unicode strings. Do I need to encode all of them?

1 Answer


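ISO-8859-1 is most likely not the solution to your first problem. Be aware that any_random_gibberish.decode('ISO-8859-1') simply cannot fail: ISO-8859-1 assigns a character to every one of the 256 byte values, so the fact that the decode succeeds tells you nothing.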
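A small sketch of that point, written to run on both Python 2 and Python 3. The names every_byte, as_text and round_trip are only illustrative. It feeds all 256 byte values through ISO-8859-1 and then tries the same with ASCII:

```python
# All 256 possible byte values, standing in for arbitrary binary data such as a zip file.
every_byte = bytes(bytearray(range(256)))

# ISO-8859-1 maps each byte to the code point with the same number, so this never raises.
as_text = every_byte.decode("iso-8859-1")

# The round trip is lossless, so the decode step adds nothing for an upload.
round_trip = as_text.encode("iso-8859-1")

# ASCII, by contrast, rejects the first byte above 0x7F.
try:
    every_byte.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

The ISO-8859-1 decode hides the symptom rather than fixing the cause: the data becomes a unicode object that still has to be encoded again before it can be sent.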

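Secondly, I don't see why any encoding is needed when uploading a file. Isn't the point of an upload to reproduce the file exactly on the server? Decoding a zip file into a unicode object seems very strange.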

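It would be a good idea to post the error you are getting ("encoding error when reading the file") together with the full traceback, so that people can help you. The URL of the API you mention is also needed.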

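Update: You say you get an "ascii error" on the line body = '\r\n'.join(L). Based on the information you have provided, a reasonable guess is that you are running into this: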

>>> "".join([u"foo", "\xff"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)

u"foo" + "\xff" produces the same result.

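What is happening is that you have a mixture of unicode and str objects. Concatenating them requires converting the str objects to unicode, and that conversion is attempted with the default encoding, usually ascii, which fails whenever a str object contains non-ASCII bytes.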
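The implicit step can be spelled out by hand. The sketch below is not how the interpreter is implemented: join_like_python2 is a hypothetical helper that imitates the coercion by decoding every byte string as ASCII (assumed to be the default encoding) before joining, so it behaves the same way on Python 2 and Python 3.

```python
def join_like_python2(separator, items):
    """Imitate Python 2 joining a mix of unicode and str objects.

    Byte strings are first decoded with the ASCII codec (assumed here to be
    the default encoding); only then are the pieces joined as text.
    """
    decoded = []
    for item in items:
        if isinstance(item, bytes):
            item = item.decode("ascii")  # raises on any byte >= 0x80
        decoded.append(item)
    return separator.join(decoded)


# Pure-ASCII byte strings pass through the coercion unharmed.
ok = join_like_python2(u"", [u"foo", b"bar"])

# A single non-ASCII byte reproduces the kind of error shown in the traceback.
try:
    join_like_python2(u"", [u"foo", b"\xff"])
    error = None
except UnicodeDecodeError as exc:
    error = exc
```

This is why the failure only shows up once binary data such as the zip contents lands in the same list as a unicode value.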

In this case the problem is not the str objects but the unicode objects: you cannot send unencoded unicode objects as they are.

I suggest that you replace this code:

for key, filename in files:
    f = file(filename, 'r')
    filedata = f.read().decode("ISO-8859-1")

with this:

for key, filename in files:
    f = file(filename, 'rb') # Specify binary mode in case this gets run on Windows
    filedata = f.read() # don't decode it

and print the function's arguments as soon as you enter it, so that you can see exactly which ones are unicode:

print "fields:", repr(fields)
print "files:", repr(files)

Very likely all of the unicode objects can be safely converted to ascii by (explicitly) calling unicode_object.encode("ascii").

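Update 2: It is worth investigating why some of your values are unicode and some are str. It looks as though all of the unicode values can be safely encoded as ascii: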

new = [(k, v.encode('ascii') if isinstance(v, unicode) else v) for k, v in original]
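Putting the pieces together, here is a rough sketch of an encoder that works purely on byte strings. It is not the CurseForge script itself: to_bytes and encode_multipart_bytes are hypothetical names, the boundary comes from uuid instead of mimetools.choose_boundary so that the sketch also runs on Python 3, and ASCII is assumed to be enough for the text fields, as it is for the values printed in the question.

```python
import os
import uuid


def to_bytes(value, encoding="ascii"):
    """Return value as a byte string, encoding text explicitly."""
    if isinstance(value, bytes):
        return value  # already bytes (a Python 2 str or raw file data)
    return value.encode(encoding)


def encode_multipart_bytes(fields, files):
    """Build a multipart/form-data body made only of byte strings."""
    boundary = uuid.uuid4().hex  # assumption: any unique token will do
    delimiter = to_bytes("--" + boundary)
    parts = []

    for key, value in fields:
        if value is None:
            value = ""
        elif value is False:
            continue
        parts.append(delimiter)
        parts.append(to_bytes('Content-Disposition: form-data; name="%s"' % key))
        parts.append(b"")
        parts.append(to_bytes(value))  # unicode values are encoded here

    for key, filename in files:
        with open(filename, "rb") as f:
            filedata = f.read()  # raw bytes, never decoded
        disposition = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (
            key, os.path.basename(filename))
        parts.append(delimiter)
        parts.append(to_bytes(disposition))
        parts.append(b"Content-Type: application/zip")
        parts.append(b"")
        parts.append(filedata)

    parts.append(delimiter + b"--")
    parts.append(b"")

    # Every element is bytes, so no implicit ASCII decoding can take place.
    body = b"\r\n".join(parts)
    content_type = "multipart/form-data; boundary=%s" % boundary
    return content_type, body
```

The returned body can be handed to conn.request unchanged. If a field ever contains non-ASCII text, to_bytes will raise instead of sending something ambiguous, and the right encoding to use then depends on what the API expects.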
