Plan, design, and build train and test matrices

Detailed description of the matrix-architect Python project


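Copyright © 2017. The University of Chicago ("Chicago"). All rights reserved.

Permission to use, copy, modify, and distribute this software (including all object code and source code) and any accompanying documentation (together, the "Program") for educational and not-for-profit research purposes, without fee and without a signed licensing agreement, is hereby granted, provided that the above copyright notice, this paragraph, and the following three paragraphs appear in all copies, modifications, and distributions. For the avoidance of doubt, educational and not-for-profit research purposes exclude any service, or part of selling a service, that uses the Program. To obtain a commercial license for the Program, contact Technology Commercialization and Licensing, Polsky Center for Entrepreneurship and Innovation, University of Chicago, 1452 East 53rd Street, 2nd Floor, Chicago, IL 60615.

Created by Data Science and Public Policy, University of Chicago

The Program is copyrighted by Chicago. The Program is supplied "as is", without any accompanying services from Chicago. Chicago does not warrant that the operation of the Program will be uninterrupted or error-free. The end user understands that the Program was developed for research purposes and is advised not to rely exclusively on the Program for any reason.

In no event shall Chicago be liable to any party for direct, indirect, special, incidental, or consequential damages, including losses arising out of the use of the Program, even if Chicago has been advised of the possibility of such damage. Chicago specifically disclaims any warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The Program provided hereunder is provided "as is". Chicago has no obligation to provide maintenance, support, updates, enhancements, or modifications.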

Description: Architect

Plan, design, and build train and test matrices

[![Build Status](https://travis-ci.org/dssg/architect.svg?branch=master)](https://travis-ci.org/dssg/architect) [![codecov](https://codecov.io/gh/dssg/architect/branch/master/graph/badge.svg)](https://codecov.io/gh/dssg/architect) [![codeclimate](https://codeclimate.com/github/dssg/architect.png)](https://codeclimate.com/github/dssg/architect)

To run classification algorithms on source data, the data must first be organized into design matrices. Converting cleaned data into these matrices is not a trivial task: creating the needed features and labels for an experiment from source data can be complicated, assembling the matrices themselves from features and labels can be inefficient, and every step offers an opportunity to leak data backwards in time, giving a model trained on a matrix an unfair advantage.

The Architect addresses these issues with functionality aimed at all tasks between cleaned source data (in a PostgreSQL database) and design matrices.
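To make the leakage risk concrete, the sketch below splits a list of events around an as-of date, so that features only see events strictly before that date and the label only looks at outcomes in a window starting at it. It is an illustration in plain Python, not the Architect's implementation (which works against PostgreSQL); the function names `features_as_of` and `label_as_of`, the event layout, and the 180-day window are all assumptions made for this example.

```python
from datetime import date, timedelta

# Hypothetical event records: (entity_id, event_date, is_outcome).
EVENTS = [
    (1, date(2016, 1, 10), False),
    (1, date(2016, 3, 5), False),
    (1, date(2016, 8, 20), True),
    (2, date(2016, 2, 1), False),
    (2, date(2017, 5, 1), True),
]


def features_as_of(events, entity_id, as_of_date):
    """Count an entity's events strictly before as_of_date (no future data)."""
    return sum(
        1
        for eid, event_date, _ in events
        if eid == entity_id and event_date < as_of_date
    )


def label_as_of(events, entity_id, as_of_date, window=timedelta(days=180)):
    """Return 1 if an outcome occurs in [as_of_date, as_of_date + window)."""
    return int(
        any(
            eid == entity_id
            and is_outcome
            and as_of_date <= event_date < as_of_date + window
            for eid, event_date, is_outcome in events
        )
    )


# Build one (entity_id, feature, label) row per entity for a single as-of date.
as_of = date(2016, 6, 1)
rows = [
    (eid, features_as_of(EVENTS, eid, as_of), label_as_of(EVENTS, eid, as_of))
    for eid in (1, 2)
]
```

Because the feature filter uses `<` and the label window starts at `as_of_date`, no event can contribute to both the feature and the label of the same row.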

## Components

  • [LabelGenerator](architect/label_generators.py): Create binary labels suitable for a design matrix by querying a database table containing outcome events.
  • [FeatureGenerator](architect/feature_generators.py): Create aggregate features suitable for a design matrix from a set of database tables containing events. Uses [collate](https://github.com/dssg/collate/) to build aggregation SQL queries.
  • [FeatureGroupCreator](architect/feature_group_creator.py), [FeatureGroupMixer](architect/feature_group_mixer.py): Create groupings of features, and mix them using different strategies (like ‘leave one out’) to test their effectiveness.
  • [Planner](architect/planner.py), [Builder](architect/builders.py): Build all design matrices needed for an experiment, taking into account different labels, state configurations, and feature groups.
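
The 'leave one out' strategy mentioned in the list above can be sketched in a few lines: given N feature groups, produce N combinations that each omit exactly one group. This is a standalone illustration and not the `FeatureGroupMixer` API; the function name `leave_one_out` and the group and feature names are hypothetical.

```python
def leave_one_out(feature_groups):
    """Return one combination per group, each omitting that group.

    feature_groups maps a (hypothetical) group name to its feature names.
    """
    names = list(feature_groups)
    return [
        {name: feature_groups[name] for name in names if name != left_out}
        for left_out in names
    ]


# Hypothetical feature groups, used only for illustration.
groups = {
    "arrests": ["arrests_count_1y", "arrests_count_5y"],
    "demographics": ["age"],
    "inspections": ["inspections_count_1y"],
}
combinations = leave_one_out(groups)
```

Training one model per combination and comparing the results gives a rough view of how much each omitted group contributes.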

In addition to being usable individually to assist in different aspects of building matrices in your project, the Architect components are integrated in [triage](https://github.com/dssg/triage) as a part of an entire modeling experiment that incorporates later tasks like model training and testing.

## Distributing, Building & Testing

The Architect is a Python package distributable via setuptools. It may be installed directly using easy_install or pip, or listed as a dependency of another package (namely triage), under the package name matrix-architect.

To build this package for development, its dependencies may be installed using pip:

pip install -r requirements_dev.txt

(Or, to skip the testing and development dependencies, use requirements.txt instead.)

Then, once the package is built for development, run the tests:

pytest

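Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.4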
