Accessing other form fields in a custom Django file upload handler
I wrote a custom Django file upload handler for my current project. It's a proof of concept that lets you compute the hash of an uploaded file without ever storing the file on disk. It's only a proof of concept, but once I get it working I can move on to the real purpose of my work.
Basically, here's what I have so far. It works well, with one major problem:
from django.core.files.uploadhandler import *
from hashlib import sha256

from myproject.upload.files import MyProjectUploadedFile


class MyProjectUploadHandler(FileUploadHandler):
    def __init__(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).__init__(*args, **kwargs)

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        self.activated = True

    def new_file(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).new_file(*args, **kwargs)
        self.digester = sha256()
        raise StopFutureHandlers()

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())
This custom upload handler works quite well. The hash is accurate, and it only ever uses 64 KB of memory at a time, without storing the uploaded file to disk.
The only problem I'm running into is that I need access to another field in the POST request before processing the file: a text salt entered by the user. My form looks like this:
<form id="myForm" method="POST" enctype="multipart/form-data" action="/upload/">
<fieldset>
<input name="salt" type="text" placeholder="Salt">
<input name="uploadfile" type="file">
<input type="submit">
</fieldset>
</form>
The "salt" POST variable is only available after the request has been processed and the file uploaded, which doesn't work for my use case. I can't seem to find any way to access this variable from within my upload handler.
Is there a way to access each multipart variable as it is received, rather than just the uploaded file?
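For reference, this is roughly how a handler like this gets installed (the view and module names below are illustrative, not my actual code): upload handlers have to be swapped in before request.POST or request.FILES is first touched, which is exactly why the salt isn't available to the handler yet.

# Rough sketch of installing the handler; names are illustrative only.
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt, csrf_protect

from myproject.upload.handlers import MyProjectUploadHandler  # assumed module path


@csrf_exempt
def upload(request):
    # Upload handlers must be swapped in before request.POST or request.FILES
    # is first accessed; the csrf_exempt/csrf_protect pair keeps the CSRF
    # middleware from touching request.POST too early.
    request.upload_handlers.insert(0, MyProjectUploadHandler(request))
    return _upload(request)


@csrf_protect
def _upload(request):
    if request.method != 'POST':
        return HttpResponse(status=405)
    # Only here, after the body has been fully parsed and the file already
    # streamed through the handler, does request.POST exist -- which is the
    # ordering problem described above.
    salt = request.POST.get('salt')
    uploaded = request.FILES.get('uploadfile')  # the MyProjectUploadedFile
    return HttpResponse('salt=%s, digest=%r' % (salt, uploaded))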
2 Answers
Based on the code for FileUploadHandler, found at line 62 of:
https://github.com/django/django/blob/master/django/core/files/uploadhandler.py
It looks like the request object gets passed into the handler and stored as self.request.
That being the case, you should be able to access the salt anywhere in your upload handler with:
salt = self.request.POST.get('salt')
Unless I'm misunderstanding your question.
My solution wasn't an easy one to come up with, but here it is:
import base64

# NOTE: these helpers live in the modules below in Django 1.x/2.x; exact
# locations and names (e.g. force_text) may vary between Django versions.
from django.core.files.uploadhandler import FileUploadHandler, SkipFile, StopUpload
from django.http.multipartparser import (ChunkIter, LazyStream, Parser,
                                          MultiPartParserError, FIELD, FILE,
                                          exhaust)
from django.utils.datastructures import MultiValueDict
from django.utils.encoding import force_text
from django.utils.text import unescape_entities


class IntelligentUploadHandler(FileUploadHandler):
    """
    An upload handler which overrides the default multipart parser to allow
    simultaneous parsing of fields and files... intelligently. Subclass this
    for real and true awesomeness.
    """
    def __init__(self, *args, **kwargs):
        super(IntelligentUploadHandler, self).__init__(*args, **kwargs)

    def field_parsed(self, field_name, field_value):
        """
        A callback method triggered when a non-file field has been parsed
        successfully by the parser. Use this to listen for new fields being
        parsed.
        """
        pass

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        """
        Parse the raw input from the HTTP request and split items into fields
        and files, executing callback methods as necessary.

        Shamelessly adapted and borrowed from
        django.http.multipartparser.MultiPartParser.
        """
        # following suit from the source class, this is imported here to avoid
        # a potential circular import
        from django.http import QueryDict

        # create return values
        self.POST = QueryDict('', mutable=True)
        self.FILES = MultiValueDict()

        # initialize the parser and stream
        stream = LazyStream(ChunkIter(input_data, self.chunk_size))

        # whether or not to signal a file-completion at the beginning of the loop.
        old_field_name = None
        counter = 0

        try:
            for item_type, meta_data, field_stream in Parser(stream, boundary):
                if old_field_name:
                    # we run this test at the beginning of the next loop since
                    # we cannot be sure a file is complete until we hit the next
                    # boundary/part of the multipart content.
                    file_obj = self.file_complete(counter)

                    if file_obj:
                        # if we return a file object, add it to the files dict
                        self.FILES.appendlist(force_text(old_field_name, encoding,
                                                         errors='replace'), file_obj)

                    # wipe it out to prevent havoc
                    old_field_name = None

                try:
                    disposition = meta_data['content-disposition'][1]
                    field_name = disposition['name'].strip()
                except (KeyError, IndexError, AttributeError):
                    continue

                transfer_encoding = meta_data.get('content-transfer-encoding')

                if transfer_encoding is not None:
                    transfer_encoding = transfer_encoding[0].strip()

                field_name = force_text(field_name, encoding, errors='replace')

                if item_type == FIELD:
                    # this is a POST field
                    if transfer_encoding == "base64":
                        raw_data = field_stream.read()
                        try:
                            data = base64.b64decode(raw_data)
                        except Exception:
                            data = raw_data
                    else:
                        data = field_stream.read()

                    self.POST.appendlist(field_name, force_text(data, encoding,
                                                                errors='replace'))

                    # trigger listener
                    self.field_parsed(field_name, self.POST.get(field_name))
                elif item_type == FILE:
                    # this is a file
                    file_name = disposition.get('filename')

                    if not file_name:
                        continue

                    # transform the file name
                    file_name = force_text(file_name, encoding, errors='replace')
                    file_name = self.IE_sanitize(unescape_entities(file_name))

                    content_type = meta_data.get('content-type', ('',))[0].strip()

                    try:
                        charset = meta_data.get('content-type', (0, {}))[1].get('charset', None)
                    except Exception:
                        charset = None

                    try:
                        file_content_length = int(meta_data.get('content-length')[0])
                    except (IndexError, TypeError, ValueError):
                        file_content_length = None

                    counter = 0

                    # now, do the important file stuff
                    try:
                        # alert on the new file
                        self.new_file(field_name, file_name, content_type,
                                      file_content_length, charset)

                        # chubber-chunk it
                        for chunk in field_stream:
                            if transfer_encoding == "base64":
                                # base 64 decode it if need be
                                over_bytes = len(chunk) % 4

                                if over_bytes:
                                    over_chunk = field_stream.read(4 - over_bytes)
                                    chunk += over_chunk

                                try:
                                    chunk = base64.b64decode(chunk)
                                except Exception as e:
                                    # since this is only a chunk, any error is an unfixable error
                                    raise MultiPartParserError("Could not decode base64 data: %r" % e)

                            chunk_length = len(chunk)
                            self.receive_data_chunk(chunk, counter)
                            counter += chunk_length
                            # ... and we're done
                    except SkipFile:
                        # just eat the rest
                        exhaust(field_stream)
                    else:
                        # handle file upload completions on next iteration
                        old_field_name = field_name
        except StopUpload as e:
            # if we get a request to stop the upload, exhaust it if no con reset
            if not e.connection_reset:
                exhaust(input_data)
        else:
            # make sure that the request data is all fed
            exhaust(input_data)

        # signal the upload has been completed
        self.upload_complete()

        return self.POST, self.FILES

    def IE_sanitize(self, filename):
        """Cleanup filename from Internet Explorer full paths."""
        return filename and filename[filename.rfind("\\")+1:].strip()
In short, by subclassing this class you get a much "smarter" upload handler: the subclass is notified of each non-file field through the field_parsed method, which is exactly what I needed.
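For example, a subclass along these lines (a sketch, not part of the handler above; MyProjectUploadedFile is the wrapper from the question, and it assumes the salt input appears before the file input in the form, so the browser sends that part first) can seed the hash with the salt before the first file chunk arrives:

from hashlib import sha256

from myproject.upload.files import MyProjectUploadedFile  # from the question


class SaltedHashUploadHandler(IntelligentUploadHandler):
    """Sketch: hash the uploaded file, seeded with the user-supplied salt."""

    def __init__(self, *args, **kwargs):
        super(SaltedHashUploadHandler, self).__init__(*args, **kwargs)
        self.salt = ''

    def field_parsed(self, field_name, field_value):
        # Called as each non-file field is parsed, i.e. before the file,
        # provided the salt input precedes the file input in the form.
        if field_name == 'salt':
            self.salt = field_value

    def new_file(self, *args, **kwargs):
        super(SaltedHashUploadHandler, self).new_file(*args, **kwargs)
        self.digester = sha256(self.salt.encode('utf-8'))

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())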
I've submitted this as a feature request to the Django team, in the hope that it becomes a regular part of Django's toolkit rather than something that requires hacking around the source as above.