Python classes supporting data science projects.

Detailed description of the resumableds Python project


resumableds

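resumableds supports you in writing data science scripts that include save/resume functionality. Data can be saved and resumed, which avoids unnecessary retrieval of raw data from the data storage. The data directory structure is inspired by cookiecutter data science (https://drivendata.github.io/cookiecutter-data-science/). The classes also support the statement "analysis is a DAG" (https://drivendata.github.io/cookiecutter-data-science/#analysis-is-a-dag).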
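The idea of avoiding repeated retrieval of raw data can be sketched with the standard library alone. This is a minimal illustration of the general cache-or-fetch pattern, not the resumableds implementation; `load_or_fetch`, `fetch_raw`, and the file name `data.pkl` are hypothetical names chosen for this example.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_fetch(cache_file, fetch):
    """Return the cached object if present; otherwise call fetch() and cache the result."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        # Resume: read the previously saved object instead of querying the data store.
        with cache_file.open("rb") as fh:
            return pickle.load(fh)
    data = fetch()  # the expensive retrieval happens only on a cache miss
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as fh:
        pickle.dump(data, fh)
    return data


calls = []  # records how often the "data store" was actually queried


def fetch_raw():
    # Hypothetical stand-in for a slow query against a database or API.
    calls.append(1)
    return {"rows": [1, 2, 3]}


workdir = Path(tempfile.mkdtemp())
first = load_or_fetch(workdir / "raw" / "data.pkl", fetch_raw)   # fetches and saves
second = load_or_fetch(workdir / "raw" / "data.pkl", fetch_raw)  # served from disk
```

On the second call the raw data comes from the pickle file, so `fetch_raw` runs only once.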

resumableds is written in pure Python and is intended for use in Jupyter notebooks.

Example

<code> 
import pandas as pd # needed for pd.DataFrame below; RdsProject comes from the resumableds package (its import line is not shown in the original)
proj1 = RdsProject('project1') # create object from class (creates the dir if it doesn't exist yet)
proj1.raw.df1 = pd.DataFrame() # create dataframe as attribute of proj1.raw (RdsFs 'raw')
proj1.defs.variable1 = 'foo' # create simple objects as attribute of proj1.defs (RdsFs 'defs')
proj1.save() # saves the attributes of all RdsFs in proj1 to disk
</code>
This will result in the following directory structure (plus some overhead of internals):
- <output_dir>/defs/var_variable1.pkl
- <output_dir>/raw/df1.pkl
- <output_dir>/raw/df1.csv

Note that pandas DataFrames are always dumped twice: as pickle for further processing and as CSV for easy exploration. The CSV files are never read back.
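The dual dump described in the note can be illustrated as follows. This is a rough sketch under stated assumptions, not the library's code: to stay dependency-free it uses a list of dicts in place of a pandas DataFrame, and `dump_table` and `load_table` are hypothetical helpers.

```python
import csv
import pickle
import tempfile
from pathlib import Path


def dump_table(rows, directory, name):
    """Write rows (a list of dicts) as <name>.pkl and <name>.csv."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    pkl_path = directory / f"{name}.pkl"
    csv_path = directory / f"{name}.csv"
    # Pickle keeps the exact Python objects for later processing.
    with pkl_path.open("wb") as fh:
        pickle.dump(rows, fh)
    # CSV is a human-readable copy meant only for inspection.
    with csv_path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0]) if rows else [])
        writer.writeheader()
        writer.writerows(rows)
    return pkl_path, csv_path


def load_table(directory, name):
    """Restore from the pickle only; the CSV copy is never read back."""
    with (Path(directory) / f"{name}.pkl").open("rb") as fh:
        return pickle.load(fh)


rows = [{"id": 1, "value": 2.5}, {"id": 2, "value": 4.0}]
raw_dir = Path(tempfile.mkdtemp()) / "raw"
pkl_path, csv_path = dump_table(rows, raw_dir, "df1")
restored = load_table(raw_dir, "df1")
```

Reading back only from the pickle preserves types such as floats exactly, whereas a CSV round trip would have to re-parse every value from text.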

Later on, or in another Python session, you can do this:
<code>
proj2 = RdsProject('project1') # create object from class (doesn't touch the dir as it already exists); all variables and data are read back under their original names
proj2.defs.variable1 == 'foo' # ==> True
isinstance(proj2.raw.df1, pd.DataFrame) # ==> True
</code>
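How attributes might survive across sessions can be sketched with a small stand-in class. `MiniStore` is a hypothetical name and a simplification; it is not how RdsProject or RdsFs are actually implemented. It only borrows the `var_<name>.pkl` file naming from the directory listing above.

```python
import pickle
import tempfile
from pathlib import Path


class MiniStore:
    """Hypothetical attribute store: each public attribute becomes one pickle file."""

    def __init__(self, directory):
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)  # no-op if the dir already exists
        # Resume: restore each saved object under its original attribute name.
        for path in sorted(self._dir.glob("var_*.pkl")):
            with path.open("rb") as fh:
                setattr(self, path.stem[len("var_"):], pickle.load(fh))

    def save(self):
        """Write every public attribute to <dir>/var_<name>.pkl."""
        for name, value in vars(self).items():
            if name.startswith("_"):
                continue  # skip internals such as _dir
            with (self._dir / f"var_{name}.pkl").open("wb") as fh:
                pickle.dump(value, fh)


root = Path(tempfile.mkdtemp()) / "defs"

store1 = MiniStore(root)      # first "session": directory is created
store1.variable1 = "foo"
store1.save()

store2 = MiniStore(root)      # second "session": attributes are read back
```

Because loading happens in the constructor, a fresh object pointed at the same directory picks up where the previous one left off, which mirrors the `proj1` / `proj2` example.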

