PyScaffold extension for data science projects
Detailed description of the Python project pyscaffoldext-dsproject
pyscaffoldext-dsproject
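A PyScaffold extension tailored for data science projects. This extension is inspired by cookiecutter-data-science and enhances it in many ways. The main differences are that it
- advocates a proper Python package structure that can be shipped and distributed,
- uses a conda environment instead of a virtualenv-based one and is thus better suited to data science projects,
- creates more default configurations for Sphinx, py.test, pre-commit, etc. to foster clean coding and best practices.
Also consider using dvc to version-control your data and share it within your team.
The final directory structure looks like this: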
├── AUTHORS.rst <- List of developers and maintainers.
├── CHANGELOG.rst <- Changelog to keep track of new features and fixes.
├── LICENSE.txt <- License as chosen on the command-line.
├── README.md <- The top-level README for developers.
├── configs <- Directory for configurations of model & application.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
├── docs <- Directory for Sphinx documentation in rst or md.
├── environment.yaml <- The conda environment file for reproducibility.
├── models <- Trained and serialized models, model predictions,
│ or model summaries.
├── notebooks <- Jupyter notebooks. Naming convention is a number (for
│ ordering), the creator's initials and a description,
│ e.g. `1.0-fw-initial-data-exploration`.
├── references <- Data dictionaries, manuals, and all other materials.
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated plots and figures for reports.
├── scripts <- Analysis and production scripts which import the
│ actual PYTHON_PKG, e.g. train_model.
├── setup.cfg <- Declarative configuration of your project.
├── setup.py <- Make this project pip installable with `pip install -e`
│ or `python setup.py develop`.
├── src
│ └── PYTHON_PKG <- Actual Python package where the main functionality goes.
├── tests <- Unit tests which can be run with `py.test` or
│ `python setup.py test`.
├── .coveragerc <- Configuration for coverage reports of unit tests.
├── .isort.cfg <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.
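The notebook naming convention mentioned in the tree (a number for ordering, the creator's initials and a description) can be checked with a small regular expression. The following is only a sketch of one possible reading of that convention: it assumes lowercase initials and a hyphen-separated lowercase description, and the names `NOTEBOOK_NAME` and `follows_convention` are hypothetical helpers, not part of pyscaffoldext-dsproject.

```python
import re

# One possible reading of the convention (an assumption, not an official rule):
#   <number>-<initials>-<description>, e.g. 1.0-fw-initial-data-exploration
NOTEBOOK_NAME = re.compile(
    r"\d+(\.\d+)*"            # ordering number such as 1, 1.0 or 2.1.3
    r"-[a-z]+"                # creator's initials, assumed lowercase
    r"-[a-z0-9]+(-[a-z0-9]+)*"  # hyphen-separated description
)


def follows_convention(stem):
    """Return True if a notebook name (without .ipynb) matches the pattern."""
    return NOTEBOOK_NAME.fullmatch(stem) is not None
```

For instance, `follows_convention("1.0-fw-initial-data-exploration")` accepts the example name from the tree, while a bare name such as `exploration` is rejected.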
See a demonstration of the initial project structure under dsproject-demo, and check out the PyScaffold documentation for more information.
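To confirm that a generated project still matches the layout listed in the tree, a short script can compare a directory against the expected entries. This is a minimal sketch: the entry lists are copied from the tree, while `EXPECTED_DIRS`, `EXPECTED_FILES` and `missing_entries` are hypothetical names introduced here for illustration and are not provided by PyScaffold or this extension.

```python
from pathlib import Path

# Directories and files copied from the directory tree in this description.
EXPECTED_DIRS = [
    "configs", "data/external", "data/interim", "data/processed", "data/raw",
    "docs", "models", "notebooks", "references", "reports/figures",
    "scripts", "src", "tests",
]
EXPECTED_FILES = [
    "AUTHORS.rst", "CHANGELOG.rst", "LICENSE.txt", "README.md",
    "environment.yaml", "setup.cfg", "setup.py",
    ".coveragerc", ".isort.cfg", ".pre-commit-config.yaml",
]


def missing_entries(project_root):
    """Return the expected directories and files absent from project_root."""
    root = Path(project_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    import sys

    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for entry in missing_entries(target):
        print(f"missing: {entry}")
```

Run from the root of a generated project, the script prints nothing when every listed entry is present.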
Usage
Just install this package with `pip install pyscaffoldext-dsproject` and note that `putup -h` now shows a new option, `--dsproject`.
Creating a data science project is then as simple as:
putup --dsproject my_ds_project
Note
This project has been set up using PyScaffold 3.2. For details and usage information on PyScaffold, see https://pyscaffold.org/.