Accessing other form fields in a custom Django file upload handler
I wrote a custom Django file upload handler for my current project. It's a proof of concept that lets you compute the hash of an uploaded file without ever storing the file on disk. It's only a proof of concept, but once I get it working I can move on to the real purpose of my work.
Basically, here's what I have so far. It works well, with one major problem:
from django.core.files.uploadhandler import *
from hashlib import sha256

from myproject.upload.files import MyProjectUploadedFile


class MyProjectUploadHandler(FileUploadHandler):
    def __init__(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).__init__(*args, **kwargs)

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        self.activated = True

    def new_file(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).new_file(*args, **kwargs)
        self.digester = sha256()
        raise StopFutureHandlers()

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())
This custom upload handler works quite well. The hash is accurate, and it only ever uses 64 KB of memory at a time, without storing the uploaded file to disk.
The only problem I'm running into is that I need access to another field in the POST request before processing the file: a text salt entered by the user. My form looks like this:
<form id="myForm" method="POST" enctype="multipart/form-data" action="/upload/">
<fieldset>
<input name="salt" type="text" placeholder="Salt">
<input name="uploadfile" type="file">
<input type="submit">
</fieldset>
</form>
The "salt" POST variable is only available after the request has been processed and the file uploaded, which doesn't work for my use case. I can't seem to find any way to access this variable from within my upload handler.
Is there a way to access each multipart variable as it is received, rather than just the uploaded file?
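For reference, this is roughly how a handler like this gets installed (the view and module names below are illustrative, not my actual code): upload handlers have to be swapped in before request.POST or request.FILES is first touched, which is exactly why the salt isn't available to the handler yet.

# Rough sketch of installing the handler; names are illustrative only.
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt, csrf_protect

from myproject.upload.handlers import MyProjectUploadHandler  # assumed module path


@csrf_exempt
def upload(request):
    # Upload handlers must be swapped in before request.POST or request.FILES
    # is first accessed; the csrf_exempt/csrf_protect pair keeps the CSRF
    # middleware from touching request.POST too early.
    request.upload_handlers.insert(0, MyProjectUploadHandler(request))
    return _upload(request)


@csrf_protect
def _upload(request):
    if request.method != 'POST':
        return HttpResponse(status=405)
    # Only here, after the body has been fully parsed and the file already
    # streamed through the handler, does request.POST exist -- which is the
    # ordering problem described above.
    salt = request.POST.get('salt')
    uploaded = request.FILES.get('uploadfile')  # the MyProjectUploadedFile
    return HttpResponse('salt=%s, digest=%r' % (salt, uploaded))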
2 Answers
Based on the code for FileUploadHandler, found at line 62 of:
https://github.com/django/django/blob/master/django/core/files/uploadhandler.py
It looks like the request object gets passed into the handler and stored as self.request.
That being the case, you should be able to access the salt anywhere in your upload handler with:
salt = self.request.POST.get('salt')
Unless I'm misunderstanding your question.
My solution wasn't an easy one to come up with, but here it is:
import base64

# NOTE: these helpers live in the modules below in Django 1.x/2.x; exact
# locations and names (e.g. force_text) may vary between Django versions.
from django.core.files.uploadhandler import FileUploadHandler, SkipFile, StopUpload
from django.http.multipartparser import (ChunkIter, LazyStream, Parser,
                                          MultiPartParserError, FIELD, FILE,
                                          exhaust)
from django.utils.datastructures import MultiValueDict
from django.utils.encoding import force_text
from django.utils.text import unescape_entities


class IntelligentUploadHandler(FileUploadHandler):
    """
    An upload handler which overrides the default multipart parser to allow
    simultaneous parsing of fields and files... intelligently. Subclass this
    for real and true awesomeness.
    """
    def __init__(self, *args, **kwargs):
        super(IntelligentUploadHandler, self).__init__(*args, **kwargs)

    def field_parsed(self, field_name, field_value):
        """
        A callback method triggered when a non-file field has been parsed
        successfully by the parser. Use this to listen for new fields being
        parsed.
        """
        pass

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        """
        Parse the raw input from the HTTP request and split items into fields
        and files, executing callback methods as necessary.

        Shamelessly adapted and borrowed from
        django.http.multipartparser.MultiPartParser.
        """
        # following suit from the source class, this is imported here to avoid
        # a potential circular import
        from django.http import QueryDict

        # create return values
        self.POST = QueryDict('', mutable=True)
        self.FILES = MultiValueDict()

        # initialize the parser and stream
        stream = LazyStream(ChunkIter(input_data, self.chunk_size))

        # whether or not to signal a file-completion at the beginning of the loop.
        old_field_name = None
        counter = 0

        try:
            for item_type, meta_data, field_stream in Parser(stream, boundary):
                if old_field_name:
                    # we run this test at the beginning of the next loop since
                    # we cannot be sure a file is complete until we hit the next
                    # boundary/part of the multipart content.
                    file_obj = self.file_complete(counter)

                    if file_obj:
                        # if we return a file object, add it to the files dict
                        self.FILES.appendlist(force_text(old_field_name, encoding,
                                                         errors='replace'), file_obj)

                    # wipe it out to prevent havoc
                    old_field_name = None

                try:
                    disposition = meta_data['content-disposition'][1]
                    field_name = disposition['name'].strip()
                except (KeyError, IndexError, AttributeError):
                    continue

                transfer_encoding = meta_data.get('content-transfer-encoding')

                if transfer_encoding is not None:
                    transfer_encoding = transfer_encoding[0].strip()

                field_name = force_text(field_name, encoding, errors='replace')

                if item_type == FIELD:
                    # this is a POST field
                    if transfer_encoding == "base64":
                        raw_data = field_stream.read()
                        try:
                            data = base64.b64decode(raw_data)
                        except Exception:
                            data = raw_data
                    else:
                        data = field_stream.read()

                    self.POST.appendlist(field_name, force_text(data, encoding,
                                                                errors='replace'))

                    # trigger listener
                    self.field_parsed(field_name, self.POST.get(field_name))
                elif item_type == FILE:
                    # this is a file
                    file_name = disposition.get('filename')

                    if not file_name:
                        continue

                    # transform the file name
                    file_name = force_text(file_name, encoding, errors='replace')
                    file_name = self.IE_sanitize(unescape_entities(file_name))

                    content_type = meta_data.get('content-type', ('',))[0].strip()

                    try:
                        charset = meta_data.get('content-type', (0, {}))[1].get('charset', None)
                    except Exception:
                        charset = None

                    try:
                        file_content_length = int(meta_data.get('content-length')[0])
                    except (IndexError, TypeError, ValueError):
                        file_content_length = None

                    counter = 0

                    # now, do the important file stuff
                    try:
                        # alert on the new file
                        self.new_file(field_name, file_name, content_type,
                                      file_content_length, charset)

                        # chubber-chunk it
                        for chunk in field_stream:
                            if transfer_encoding == "base64":
                                # base 64 decode it if need be
                                over_bytes = len(chunk) % 4

                                if over_bytes:
                                    over_chunk = field_stream.read(4 - over_bytes)
                                    chunk += over_chunk

                                try:
                                    chunk = base64.b64decode(chunk)
                                except Exception as e:
                                    # since this is only a chunk, any error is an unfixable error
                                    raise MultiPartParserError("Could not decode base64 data: %r" % e)

                            chunk_length = len(chunk)
                            self.receive_data_chunk(chunk, counter)
                            counter += chunk_length
                            # ... and we're done
                    except SkipFile:
                        # just eat the rest
                        exhaust(field_stream)
                    else:
                        # handle file upload completions on next iteration
                        old_field_name = field_name
        except StopUpload as e:
            # if we get a request to stop the upload, exhaust it if no con reset
            if not e.connection_reset:
                exhaust(input_data)
        else:
            # make sure that the request data is all fed
            exhaust(input_data)

        # signal the upload has been completed
        self.upload_complete()

        return self.POST, self.FILES

    def IE_sanitize(self, filename):
        """Cleanup filename from Internet Explorer full paths."""
        return filename and filename[filename.rfind("\\")+1:].strip()
In short, by subclassing this class you get a much "smarter" upload handler: the subclass is notified of each non-file field through the field_parsed method, which is exactly what I needed.
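For example, a subclass along these lines (a sketch, not part of the handler above; MyProjectUploadedFile is the wrapper from the question, and it assumes the salt input appears before the file input in the form, so the browser sends that part first) can seed the hash with the salt before the first file chunk arrives:

from hashlib import sha256

from myproject.upload.files import MyProjectUploadedFile  # from the question


class SaltedHashUploadHandler(IntelligentUploadHandler):
    """Sketch: hash the uploaded file, seeded with the user-supplied salt."""

    def __init__(self, *args, **kwargs):
        super(SaltedHashUploadHandler, self).__init__(*args, **kwargs)
        self.salt = ''

    def field_parsed(self, field_name, field_value):
        # Called as each non-file field is parsed, i.e. before the file,
        # provided the salt input precedes the file input in the form.
        if field_name == 'salt':
            self.salt = field_value

    def new_file(self, *args, **kwargs):
        super(SaltedHashUploadHandler, self).new_file(*args, **kwargs)
        self.digester = sha256(self.salt.encode('utf-8'))

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())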
I've submitted this as a feature request to the Django team, in the hope that it becomes a regular part of Django's toolkit rather than something that requires hacking around the source as above.