Uploading a file with a Browse button in Jupyter and using/saving it

Posted 2024-05-23 19:06:17

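I came across this snippet for uploading a file in Jupyter, but I don't know how to save the file on the machine that executes the code, or how to display the first 5 lines of the uploaded file. Basically, I'm looking for the right commands to access the file once it has been uploaded: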

import io
from IPython.display import display
import fileupload

def _upload():

    _upload_widget = fileupload.FileUploadWidget()

    def _cb(change):
        decoded = io.StringIO(change['owner'].data.decode('utf-8'))
        filename = change['owner'].filename
        print('Uploaded `{}` ({:.2f} kB)'.format(
            filename, len(decoded.read()) / 2 **10))

    _upload_widget.observe(_cb, names='data')
    display(_upload_widget)

_upload()

3 answers

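_cb is called when the upload completes. As mentioned in the comments above, you can write the file to disk there or store its contents in a variable. For example: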

from IPython.display import display
import fileupload

uploader = fileupload.FileUploadWidget()

def _handle_upload(change):
    w = change['owner']
    with open(w.filename, 'wb') as f:
        f.write(w.data)
    print('Uploaded `{}` ({:.2f} kB)'.format(
        w.filename, len(w.data) / 2**10))

uploader.observe(_handle_upload, names='data')

display(uploader)

Once the upload has completed, you can access the filename with:

uploader.filename
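
The question also asks how to show the first five lines of the upload. A minimal sketch follows, assuming the file is UTF-8 text. `save_upload` and `head_lines` are hypothetical helper names, not part of `fileupload`, and stand-in bytes replace `uploader.data` so the block runs outside a notebook.

```python
import os


def save_upload(filename, data, directory="."):
    """Write uploaded bytes into `directory` and return the path written.

    os.path.basename drops any directory components from the name supplied
    by the browser, so the file always lands inside `directory`.
    """
    path = os.path.join(directory, os.path.basename(filename))
    with open(path, "wb") as f:
        f.write(data)
    return path


def head_lines(data, n=5, encoding="utf-8"):
    """Return the first `n` lines of `data` (bytes) as a list of strings."""
    return data.decode(encoding).splitlines()[:n]


# Stand-in for uploader.data; in a notebook, pass uploader.data instead.
sample = b"a,b\n1,2\n3,4\n5,6\n7,8\n9,10\n"
for line in head_lines(sample):
    print(line)
```

In the notebook this would be called as `save_upload(uploader.filename, uploader.data)` and `head_lines(uploader.data)` after the upload has finished.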

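I stumbled upon this thread two years ago. For anyone still confused about how to use the fileupload widget, I have combined the excellent answer posted by minrk with a few additional usage examples below.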

from IPython.display import display
import fileupload

uploader = fileupload.FileUploadWidget()

def _handle_upload(change):
    w = change['owner']
    with open(w.filename, 'wb') as f:
        f.write(w.data)
    print('Uploaded `{}` ({:.2f} kB)'.format(
        w.filename, len(w.data) / 2**10))

uploader.observe(_handle_upload, names='data')

display(uploader)

From the widget documentation:

class FileUploadWidget(ipywidgets.DOMWidget):
    '''File Upload Widget.
    This widget provides file upload using `FileReader`.
    '''
    _view_name = traitlets.Unicode('FileUploadView').tag(sync=True)
    _view_module = traitlets.Unicode('fileupload').tag(sync=True)

    label = traitlets.Unicode(help='Label on button.').tag(sync=True)
    filename = traitlets.Unicode(help='Filename of `data`.').tag(sync=True)
    data_base64 = traitlets.Unicode(help='File content, base64 encoded.'
                                    ).tag(sync=True)
    data = traitlets.Bytes(help='File content.')

    def __init__(self, label="Browse", *args, **kwargs):
        super(FileUploadWidget, self).__init__(*args, **kwargs)
        self._dom_classes += ('widget_item', 'btn-group')
        self.label = label

    def _data_base64_changed(self, *args):
        self.data = base64.b64decode(self.data_base64.split(',', 1)[1])

To get the data as a bytestring:

uploader.data

To get the data as a regular UTF-8 string:

datastr = str(uploader.data, 'utf-8')
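
Decoding with `'utf-8'` raises `UnicodeDecodeError` when the file was saved in another encoding. The sketch below tries a list of encodings in order. `decode_upload` is a hypothetical helper, and the default encodings are an assumption that suits many spreadsheet exports, not something the widget prescribes.

```python
def decode_upload(data, encodings=("utf-8-sig", "latin-1")):
    """Decode bytes with the first encoding in `encodings` that succeeds.

    "utf-8-sig" behaves like UTF-8 but also strips a leading byte order
    mark. "latin-1" accepts any byte sequence, so it acts as a last resort.
    """
    for enc in encodings:
        try:
            return data.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the encodings matched: {!r}".format(encodings))


# Stand-in for uploader.data: UTF-8 text with a byte order mark in front.
print(decode_upload(b"\xef\xbb\xbfa,b\n1,2\n"))
```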

To build a new pandas DataFrame from the UTF-8 string (for example, from a .csv input):

import pandas as pd
from io import StringIO

datatbl = StringIO(datastr)
newdf = pd.read_table(datatbl, sep=',', index_col=None)
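
If pandas is not available, the standard library `csv` module can parse the same uploaded bytes. This is a minimal sketch with stand-in bytes in place of `uploader.data`; `rows_from_csv_bytes` is a hypothetical helper name and UTF-8 input is assumed.

```python
import csv
import io


def rows_from_csv_bytes(data, encoding="utf-8"):
    """Parse CSV bytes into a list of dicts keyed by the header row."""
    text = data.decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Stand-in for uploader.data.
sample = b"name,score\nalice,90\nbob,85\n"
rows = rows_from_csv_bytes(sample)
print(rows[:5])  # at most the first five records
```

All values come back as strings, so numeric columns need an explicit conversion, which pandas would otherwise infer.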

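I'm doing ML development in Jupyter Notebook and was looking for a way to pick a local file containing a dataset by browsing the local file system. Admittedly, the question here is more about uploading a file than selecting one. I'm posting a snippet I found here anyway, because searches kept leading me to this question while I was looking for a solution to my particular case.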

import os
import ipywidgets as widgets

class FileBrowser(object):
    def __init__(self):
        self.path = os.getcwd()
        self._update_files()

    def _update_files(self):
        self.files = list()
        self.dirs = list()
        if(os.path.isdir(self.path)):
            for f in os.listdir(self.path):
                ff = os.path.join(self.path, f)
                if os.path.isdir(ff):
                    self.dirs.append(f)
                else:
                    self.files.append(f)

    def widget(self):
        box = widgets.VBox()
        self._update(box)
        return box

    def _update(self, box):

        def on_click(b):
            if b.description == '..':
                self.path = os.path.split(self.path)[0]
            else:
                self.path = os.path.join(self.path, b.description)
            self._update_files()
            self._update(box)

        buttons = []
        if self.files:
            button = widgets.Button(description='..', background_color='#d0d0ff')
            button.on_click(on_click)
            buttons.append(button)
        for f in self.dirs:
            button = widgets.Button(description=f, background_color='#d0d0ff')
            button.on_click(on_click)
            buttons.append(button)
        for f in self.files:
            button = widgets.Button(description=f)
            button.on_click(on_click)
            buttons.append(button)
        box.children = tuple([widgets.HTML("<h2>%s</h2>" % (self.path,))] + buttons)

To use it:

f = FileBrowser()
f.widget()
#   <interact with widget, select a path>
# in a separate cell:
f.path # returns the selected path
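
Once a file has been clicked, `f.path` holds its full path and can be opened like any other file. The sketch below previews its first lines. `preview_file` is a hypothetical helper, UTF-8 text is assumed, and a temporary file stands in for the selected path so the block runs on its own.

```python
import os
import tempfile
from itertools import islice


def preview_file(path, n=5, encoding="utf-8"):
    """Return the first `n` lines of the text file at `path`."""
    if not os.path.isfile(path):
        # The browser also stores directories in `path` while navigating.
        raise ValueError("not a file: {!r}".format(path))
    with open(path, encoding=encoding) as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]


# Stand-in for f.path: write a small file and preview it.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.csv")
    with open(demo, "w", encoding="utf-8") as fh:
        fh.write("x,y\n1,2\n3,4\n")
    print(preview_file(demo, n=2))
```

In the notebook the call would be `preview_file(f.path)`, made in a separate cell after a file has been selected.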
