Encoding error in a Python HTTPConnection request
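I'm trying to upload a zip file to a website with a Python script. The site provides an API for exactly this purpose. However, when I use it, I get an encoding error at the point where all the strings to be sent are joined together. I traced the offending string to filedata (my zip file).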
Traceback (most recent call last):
  File "/Library/Application Junk/ProjectManager/Main.py", line 146, in OnUpload
    CurseUploader.upload_file('77353ba57bdeb5346d1b3830ed36171279763e35', 'wow', slug, version, VersionID, 'r', logText or '', 'creole', '', 'plain', zipPath)
  File "/Library/Application Junk/ProjectManager/CurseUploader.py", line 83, in upload_file
    content_type, body = encode_multipart_formdata(params, [('file', filepath)])
  File "/Library/Application Junk/ProjectManager/CurseUploader.py", line 153, in encode_multipart_formdata
    body = '\r\n'.join(L)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb in position 10: ordinal not in range(128)
Edit: Full code provided, as requested.
Edit 2: As suggested, I tried encoding all the non-ASCII strings to ASCII. I got the same error, but this time on the line L[i] = value.encode("ascii").
from httplib import HTTPConnection
from os.path import basename, exists
from mimetools import choose_boundary
try:
    import simplejson as json
except ImportError:
    import json
def get_game_versions(game):
    """
    Return the JSON response as given from /game-versions.json from curseforge.com of the given game
    `game`
        The shortened version of the game, e.g. "wow", "war", or "rom"
    """
    conn = HTTPConnection('%(game)s.curseforge.com' % { 'game': game })
    conn.request("GET", '/game-versions.json')
    response = conn.getresponse()
    assert response.status == 200, "%(status)d %(reason)s from /game-versions.json" % { 'status': response.status, 'reason': response.reason }
    assert response.content_type == 'application/json'
    data = json.loads(response.read())
    return data
def upload_file(api_key, game, project_slug, name, game_version_ids, file_type, change_log, change_markup_type, known_caveats, caveats_markup_type, filepath):
    """
    Upload a file to CurseForge.com on your project
    `api_key`
        The api-key from http://www.curseforge.com/home/api-key/
    `game`
        The shortened version of the game, e.g. "wow", "war", or "rom"
    `project_slug`
        The slug of your project, e.g. "my-project"
    `name`
        The name of the file you're uploading, this should be the version's name, do not include your project's name.
    `game_version_ids`
        A set of game version ids.
    `file_type`
        Specify 'a' for Alpha, 'b' for Beta, and 'r' for Release.
    `change_log`
        The change log of the file. Up to 50k characters is acceptable.
    `change_markup_type`
        Markup type for your change log. creole or plain is recommended.
    `known_caveats`
        The known caveats of the file. Up to 50k characters is acceptable.
    `caveats_markup_type`
        Markup type for your known caveats. creole or plain is recommended.
    `filepath`
        The path to the file to upload.
    """
    assert len(api_key) == 40
    assert 1 <= len(game_version_ids) <= 3
    assert file_type in ('r', 'b', 'a')
    assert exists(filepath)
    params = []
    params.append(('name', name))
    for game_version_id in game_version_ids:
        params.append(('game_version', game_version_id))
    params.append(('file_type', file_type))
    params.append(('change_log', change_log))
    params.append(('change_markup_type', change_markup_type))
    params.append(('known_caveats', known_caveats))
    params.append(('caveats_markup_type', caveats_markup_type))
    content_type, body = encode_multipart_formdata(params, [('file', filepath)])
    print('Got here?')
    headers = {
        "User-Agent": "CurseForge Uploader Script/1.0",
        "Content-type": content_type,
        "X-API-Key": api_key}
    conn = HTTPConnection('%(game)s.curseforge.com' % { 'game': game })
    conn.request("POST", '/projects/%(slug)s/upload-file.json' % {'slug': project_slug}, body, headers)
    response = conn.getresponse()
    if response.status == 201:
        print "Successfully uploaded %(name)s" % { 'name': name }
    elif response.status == 422:
        assert response.content_type == 'application/json'
        errors = json.loads(response.read())
        print "Form error with uploading %(name)s:" % { 'name': name }
        for k, items in errors.iteritems():
            for item in items:
                print " %(k)s: %(item)s" % { 'k': k, 'name': name }
    else:
        print "Error with uploading %(name)s: %(status)d %(reason)s" % { 'name': name, 'status': response.status, 'reason': response.reason }
def is_ascii(s):
    return all(ord(c) < 128 for c in s)
def encode_multipart_formdata(fields, files):
    """
    Encode data in multipart/form-data format.
    `fields`
        A sequence of (name, value) elements for regular form fields.
    `files`
        A sequence of (name, filename) elements for data to be uploaded as files
    Return (content_type, body) ready for httplib.HTTP instance
    """
    boundary = choose_boundary()
    L = []
    for key, value in fields:
        if value is None:
            value = ''
        elif value is False:
            continue
        L.append('--%(boundary)s' % {'boundary': boundary})
        L.append('Content-Disposition: form-data; name="%(name)s"' % {'name': key})
        L.append('')
        L.append(value)
    for key, filename in files:
        f = file(filename, 'rb')
        filedata = f.read()
        f.close()
        L.append('--%(boundary)s' % {'boundary': boundary})
        L.append('Content-Disposition: form-data; name="%(name)s"; filename="%(filename)s"' % { 'name': key, 'filename': basename(filename) })
        L.append('Content-Type: application/zip')
        L.append('')
        L.append(filedata)
    L.append('--%(boundary)s--' % {'boundary': boundary})
    L.append('')
    for i in range(len(L)):
        value = L[i]
        if not is_ascii(value):
            L[i] = value.encode("ascii")
    body = '\r\n'.join(L)
    content_type = 'multipart/form-data; boundary=%(boundary)s' % { 'boundary': boundary }
    return content_type, body
How can I fix this?
Edit 3: As requested, the full output of printing the variables.
fields: [('name', u'2.0.3'), ('game_version', u'1'), ('game_version', u'4'), ('game_version', u'9'), ('file_type', 'r'), ('change_log', u'====== 2.0.3\n* Jaliborc: Fixed a bug causing wrong items to be shown for leather, mail and plate slots\n* Jaliborc: Items are now organized by level as well\n\n====== 2.0.2\n* Jaliborc: Completly rewritten the categories dropdown to fix a bug\n\n====== 2.0.1\n* Jaliborc: Updated for patch 4.2\n* Jaliborc: Included all Firelands items\n\n===== 2.0.0\n* Jaliborc: Now works with 4.1\n* Jaliborc: Completely redesigned and improved\n* Jaliborc: Includes **all** items in-game right from the start\n* Jaliborc: Searches trough thousands of items in a blaze\n* Jaliborc: Mostly //Load on Demand//\n* Jaliborc: Only works on English clients. Versions for other clients should be released in a close future.\n\n====== 1.8.7\n* Added linkerator support for multiple chat frames\n\n====== 1.8.6\n* Fixed a bug when linking an item from the chat frame. \n\n====== 1.8.5\n* Added compatibility with WoW 3.3.5\n\n====== 1.8.3\n* Bumped TOC for 3.3\n\n====== 1.8.2\n* Bumped TOC for 3.2\n\n====== 1.8.1\n* TOC Bump + Potential WIM bugfix\n\n===== 1.8.0\n* Added "Heirloom" option to quality selector\n* Fixed a bug causing the DB to be reloaded on item scroll\n* Cleaned up the code a bit. Still need to work on the GUI/localization\n* Altered slash commands. See addon description for details.\n\n====== 1.7.2\n* Bumped the max item ID to check from 40k to 60k. Glyphs, etc, should now appear.\n\n====== 1.7.1\n* Fixed a crash issue when linking tradeskills\n\n===== 1.7.0\n* Made Wrath compatible\n* Seems to be causing a lot more CPU usage now, will investigate later.'), ('change_markup_type', 'creole'), ('known_caveats', ''), ('caveats_markup_type', 'plain')]
files: [('file', u'/Users/Jaliborc/Desktop/Ludwig 2.0.3.zip')]
It looks like there are some Unicode strings in there. Do I need to encode all of them?
1 Answer
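ISO-8859-1 is most likely not the solution to your original problem. Bear in mind that any_random_gibberish.decode('ISO-8859-1') simply cannot fail: every one of the 256 possible byte values maps to a character in that encoding, so the fact that the decode succeeds tells you nothing about whether the data really was text.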
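A quick way to convince yourself of this is the sketch below. It is my own illustration, not part of your script, and the variable names are made up. It decodes every possible byte value as ISO-8859-1 and checks that encoding the result gives back the same bytes. It should run on both Python 2 and Python 3.

```python
# Every byte value from 0x00 to 0xFF, in order.
all_bytes = bytes(bytearray(range(256)))

# Decoding as ISO-8859-1 maps each byte to the code point with the same number,
# so this can never raise, whatever the bytes are.
text = all_bytes.decode("ISO-8859-1")

# Encoding back with the same codec recovers the original bytes exactly.
roundtrip_ok = text.encode("ISO-8859-1") == all_bytes
```

So a successful decode here is not evidence that the bytes were Latin-1 text. It only means that no error was raised.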
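Second, I don't see why any decoding is needed when uploading a file: isn't the whole point to reproduce the file exactly on the server? Decoding a zip file into a unicode object seems very strange.
It would be a good idea to post the error you got ("encoding error while reading the file") together with the full traceback, so that people can help you. The URL of the API you mention is also needed.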
Update: You say you are getting an "ascii error" on the line body = '\r\n'.join(L). Based on the information you have provided, a reasonable guess is that you are running into this:
>>> "".join([u"foo", "\xff"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
u"foo" + "\xff" produces the same result.
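What is happening is that you have a mixture of unicode and str objects. Concatenating them requires the str objects to be converted to unicode, and that conversion uses the default encoding, which is usually ascii. It fails whenever a str object contains non-ASCII bytes.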
In this case the problem is not the str objects but the unicode objects: you simply cannot send unencoded unicode objects down the wire.
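To make the mechanism concrete, here is a small sketch. It is not taken from your code, and the byte values are invented. It first performs explicitly the ASCII decode that Python 2 does implicitly when a str meets a unicode, then goes the other way and encodes the text, which is the direction you want before sending anything. It should run on both Python 2 and Python 3.

```python
# Hypothetical stand-ins: a few raw bytes such as a zip file might contain,
# and a text value like the ones in `fields`.
raw = b"PK\x03\x04\xcb\x00"
text = u"2.0.3"

# Python 2 performs this decode implicitly when joining str with unicode;
# it fails as soon as it meets a byte >= 0x80.
try:
    raw.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Going the other way (text -> bytes) is explicit and leaves `raw` untouched.
body = b"\r\n".join([text.encode("ascii"), raw])
```

The binary data never needs to be decoded at all. Only the text values have to be turned into bytes.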
I suggest that you replace this code:
for key, filename in files:
    f = file(filename, 'r')
    filedata = f.read().decode("ISO-8859-1")
with this:
for key, filename in files:
    f = file(filename, 'rb') # Specify binary mode in case this gets run on Windows
    filedata = f.read() # don't decode it
and print the function's arguments immediately on entering it, so that you can see exactly which ones are unicode:
print "fields:", repr(fields)
print "files:", repr(files)
Most likely all the unicode objects can be safely converted to ascii by (explicitly) calling unicode_object.encode("ascii").
Update 2: It is worth investigating why some of your values are unicode and some are str. It looks as though all of the unicode ones can be safely encoded as ascii:
new = [(k, v.encode('ascii') if isinstance(v, unicode) else v) for k, v in original]
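Putting the pieces together, a multipart builder that never mixes the two types might look roughly like the sketch below. Treat it as an illustration under assumptions rather than a drop-in replacement. The names to_bytes and build_multipart_body are made up. It takes the file contents as bytes instead of reading them from a path. It uses uuid for the boundary where your script uses choose_boundary. It encodes text as UTF-8, which is identical to ASCII for the values shown in your output. Whether the server accepts non-ASCII UTF-8 is something I have not checked.

```python
import uuid


def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string; text is encoded explicitly."""
    if isinstance(value, bytes):
        return value  # already bytes (e.g. the zip data): leave untouched
    return value.encode(encoding)


def build_multipart_body(fields, files, boundary=None):
    """Build (content_type, body), where body is one byte string.

    fields: sequence of (name, value) pairs; values may be text or bytes.
    files:  sequence of (name, filename, data) triples; data must be bytes.
    """
    if boundary is None:
        boundary = uuid.uuid4().hex  # assumption: any unique token will do
    lines = []
    for name, value in fields:
        lines.append(u"--%s" % boundary)
        lines.append(u'Content-Disposition: form-data; name="%s"' % name)
        lines.append(u"")
        lines.append(value)
    for name, filename, data in files:
        lines.append(u"--%s" % boundary)
        lines.append(u'Content-Disposition: form-data; name="%s"; filename="%s"'
                     % (name, filename))
        lines.append(u"Content-Type: application/zip")
        lines.append(u"")
        lines.append(data)
    lines.append(u"--%s--" % boundary)
    lines.append(u"")
    # Convert every piece to bytes first, then join bytes with bytes.
    body = b"\r\n".join(to_bytes(line) for line in lines)
    content_type = "multipart/form-data; boundary=%s" % boundary
    return content_type, body
```

A call might look like build_multipart_body([(u"name", u"2.0.3")], [("file", "Ludwig 2.0.3.zip", filedata)]), with filedata read in 'rb' mode as described above. The key design point is that the join only ever sees byte strings, so no implicit ASCII decode can be triggered.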