How does cgi.FieldStorage store files?

Posted 2024-05-15 10:05:13

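So I've been playing with raw WSGI, cgi.FieldStorage, and file uploads. I just can't understand how it handles file uploads.

At first it seemed to simply store the whole file in memory. I thought, hm, that should be easy to test: a big file should clog up the memory! But when I request the file, it's a string, not an iterator, a file object, or anything like that.

I tried reading the cgi module's source and found some things about temporary files, but it returns a string, not a file-like object! So... how does it work?!

Here's the code I used: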

import cgi
from wsgiref.simple_server import make_server

def app(environ,start_response):
    start_response('200 OK',[('Content-Type','text/html')])
    output = """
    <form action="" method="post" enctype="multipart/form-data">
    <input type="file" name="failas" />
    <input type="submit" value="Varom" />
    </form>
    """
    fs = cgi.FieldStorage(fp=environ['wsgi.input'],environ=environ)
    f = fs.getfirst('failas')
    print type(f)
    return output


if __name__ == '__main__' :
    httpd = make_server('',8000,app)
    print 'Serving'
    httpd.serve_forever()

Thanks in advance! :)


3 Answers

The cgi module description has a paragraph that discusses how to handle file uploads.

If a field represents an uploaded file, accessing the value via the value attribute or the getvalue() method reads the entire file in memory as a string. This may not be what you want. You can test for an uploaded file by testing either the filename attribute or the file attribute. You can then read the data at leisure from the file attribute:

fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while 1:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1

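Regarding your example, getfirst() is just a variant of getvalue(), so it also reads the whole file into memory as a string. Try replacing

f = fs.getfirst('failas')

with

f = fs['failas'].file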

This returns a file-like object that can be read "at leisure".
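As a minimal sketch of what reading "at leisure" can look like, the helper below copies a file-like object in fixed-size chunks so the whole upload never has to sit in memory at once. The name copy_in_chunks and the 64 KiB chunk size are assumptions made for illustration, and an io.BytesIO object stands in for fs['failas'].file.

```python
import io


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk and return the number of bytes copied.

    Hypothetical helper: both arguments only need read()/write() methods.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-in for fs['failas'].file; a real upload would come from FieldStorage.
upload = io.BytesIO(b"a" * 200000)
sink = io.BytesIO()
copied = copy_in_chunks(upload, sink)
```

The standard library's shutil.copyfileobj does the same job, so in real code that is probably the simpler choice.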

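The best approach is not to read the file at all (not even line by line, as gimel suggested).

You can subclass FieldStorage and override its make_file method. make_file is called when the FieldStorage holds a file upload.

For reference, the default make_file looks like this: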

def make_file(self, binary=None):
    """Overridable: return a readable & writable file.

    The file will be used as follows:
    - data is written to it
    - seek(0)
    - data is read from it

    The 'binary' argument is unused -- the file is always opened
    in binary mode.

    This version opens a temporary file for reading and writing,
    and immediately deletes (unlinks) it.  The trick (on Unix!) is
    that the file can still be used, but it can't be opened by
    another process, and it will automatically be deleted when it
    is closed or when the current process terminates.

    If you want a more permanent file, you derive a class which
    overrides this method.  If you want a visible temporary file
    that is nevertheless automatically deleted when the script
    terminates, try defining a __del__ method in a derived class
    which unlinks the temporary files you have created.

    """
    import tempfile
    return tempfile.TemporaryFile("w+b")

Instead of creating a temporary file, you can create a permanent file wherever you need it.
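Here is a rough sketch of what such a replacement could return, assuming you already know the target directory and file name. The function make_permanent_file and the name upload.bin are hypothetical; the usage lines follow the write, seek(0), read sequence that the docstring above describes.

```python
import os
import tempfile


def make_permanent_file(directory, name):
    """Hypothetical stand-in for a make_file override.

    Returns a readable and writable binary file that stays on disk.
    """
    return open(os.path.join(directory, name), "w+b")


# A throwaway directory is used here only so the example can run anywhere.
target_dir = tempfile.mkdtemp()
f = make_permanent_file(target_dir, "upload.bin")
f.write(b"uploaded bytes")  # data is written to it
f.seek(0)                   # seek(0)
data = f.read()             # data is read from it
f.close()

# Unlike tempfile.TemporaryFile, the file is still there after closing.
still_there = os.path.exists(os.path.join(target_dir, "upload.bin"))
```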

Building on @hasanatkazmi's answer (which I used in a Twisted application), I ended up with something like this:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
# -*- indent: 4 spc -*-
import sys
import cgi
import tempfile


class PredictableStorage(cgi.FieldStorage):
    def __init__(self, *args, **kwargs):
        self.path = kwargs.pop('path', None)
        cgi.FieldStorage.__init__(self, *args, **kwargs)

    def make_file(self, binary=None):
        if not self.path:
            file = tempfile.NamedTemporaryFile("w+b", delete=False)
            self.path = file.name
            return file
        return open(self.path, 'w+b')

Note that the file is not always created by the cgi module. According to these lines in cgi.py, it is only created once the content exceeds 1000 bytes:

if self.__file.tell() + len(line) > 1000:
    self.file = self.make_file('')

So you have to check whether the file was actually created by querying the custom class's path field, like this:

import tempfile

if file_field.path:
    # cgi already wrote the upload to file_field.path; use that file.
    pass
else:
    # Small upload kept in memory: store it in a named temporary file.
    with tempfile.NamedTemporaryFile("w+b", delete=False) as f:
        f.write(file_field.value)
        # You can save 'f.name' for later use.

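If a Content-Length is also set for the field, the file should be created by cgi as well.

That's it. This way you can store files predictably and reduce your application's memory usage.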
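To make the 1000-byte cutoff easier to picture, here is a simplified sketch of the same idea: data stays in an in-memory buffer until a write would push it past the threshold, and only then is make_file called. The class SpillBuffer and its on_disk attribute are invented for illustration; this is not the actual cgi.py code.

```python
import io
import tempfile

THRESHOLD = 1000  # same cutoff as in the cgi.py lines quoted above


class SpillBuffer(object):
    """Hypothetical helper: buffer in memory, switch to a real file past THRESHOLD."""

    def __init__(self):
        self.file = io.BytesIO()
        self.on_disk = False

    def make_file(self):
        # Same default as FieldStorage.make_file: an anonymous temporary file.
        return tempfile.TemporaryFile("w+b")

    def write(self, data):
        if not self.on_disk and self.file.tell() + len(data) > THRESHOLD:
            real = self.make_file()
            real.write(self.file.getvalue())  # move what was buffered so far
            self.file = real
            self.on_disk = True
        self.file.write(data)


small = SpillBuffer()
small.write(b"x" * 1000)  # exactly 1000 bytes: stays in memory

big = SpillBuffer()
big.write(b"x" * 600)
big.write(b"x" * 600)     # 1200 bytes in total: moved to a temporary file
```

This is why a small upload never reaches make_file, and why the path check above is needed.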
