deepair-encoder Python package - PyPI

A submodule package for encoding.

deepair-encoder: detailed project description

DeepAir Encoder

This package encodes the data fields of a dataframe into a machine-compatible form.

Package Structure

deepair_encoder
├── encoder.py
├── __init__.py
└── utils
    ├── encoder_tools.py
    ├── __init__.py
    ├── logger.py
    └── splitters.py

1 directory, 6 files

Dependencies

Note: the following Python 3 packages are required to run this package:

numpy
scipy
pandas
sklearn
tabulate
tqdm
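
Before importing the package, it can be useful to confirm that these dependencies are installed. The following is a minimal sketch using only the standard library; the helper name missing_dependencies is hypothetical and not part of deepair-encoder, and the names in REQUIRED are the usual import names for the packages listed above.

```python
from importlib.util import find_spec

# Usual import names for the listed dependencies (an assumption of this sketch).
REQUIRED = ["numpy", "scipy", "pandas", "sklearn", "tabulate", "tqdm"]


def missing_dependencies(names=REQUIRED):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_dependencies()
print("missing:", ", ".join(missing) if missing else "none")
```

Note that sklearn is an import name; on PyPI the distribution is normally installed as scikit-learn.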

Function Declarations

Below are the signatures of the functions in this package that are available for DeepAir-dev.

encoder.py

The functions below can be accessed by importing from this module with from deepair_encoder.encoder import <function_name>.

encode_username:

def encode_username(df, drop_field=True):
    '''
        Username one-hot encoder.
        inputs:
            df: Dataframe which has a username column (pandas df series)
            drop_field: flag for whether the username column should be dropped after encoding (bool)
        return:
            df: Dataframe which has only the username indicator [0/1] (pandas df series)
    '''

encode_discounts:

def encode_discounts(df, drop_field=True):
    '''
        Discounts encoder.
        inputs:
            df: Dataframe which has a discounts column (pandas df series)
            drop_field: flag for whether the discounts column should be dropped after encoding (bool)
        return:
            df: Dataframe with 3 new columns 'PROMOCODE', 'RES', 'LFG', and with discounts dropped if drop_field = True
    '''

minmax_score:

def minmax_score(df, fields, keyval=['requestid', 'direction_onward'], drop_field=True, verbose=True):
    '''
        Scorer function to normalize to the range 0-1, inversely to the value.
        inputs:
            df: Dataframe which contains fields (pandas df series)
            fields: Dataframe columns on which the score is calculated (list)
            keyval: Key columns by which the data is grouped (list)
            drop_field: Indicator to replace fields with the new scored fields (bool)
            verbose: Progress bar indicator (bool)
        return:
            field: Dataframe with a score for totalprice relative to whole requestid-direction, by index, by faregroup  (pandas df series)
    '''

minmax_normalize:

def minmax_normalize(df, fields, keyval=['requestid', 'direction_onward'], drop_field=True, verbose=True):
    '''
        Normalizer function to normalize to the range 0-1, proportionally to the value.
        inputs:
            df: Dataframe which contains fields (pandas df series)
            fields: Dataframe columns on which the normalization is calculated (list)
            keyval: Key columns by which the data is grouped (list)
            drop_field: Indicator to replace fields with the new normalized fields (bool)
            verbose: Progress bar indicator (bool)
        return:
            field: Dataframe with a score for totalprice relative to whole requestid-direction, by index, by faregroup  (pandas df series)
    '''
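
The difference between minmax_score (inverse to the value) and minmax_normalize (proportional to the value) may be easier to see on a small example. The sketch below is not the package's implementation: the helper minmax_by_group and its inverse flag are hypothetical, and it assumes plain min-max scaling within each group, with constant groups mapped to 0 (or 1 when inverted) to avoid dividing by zero.

```python
import pandas as pd


def minmax_by_group(df, field, keys, inverse=False):
    """Scale df[field] to [0, 1] within each group defined by keys."""
    grouped = df.groupby(keys)[field]
    lo = grouped.transform("min")
    hi = grouped.transform("max")
    # A group with a single distinct value has hi == lo; use 1 as the divisor.
    span = (hi - lo).replace(0, 1)
    scaled = (df[field] - lo) / span
    return 1 - scaled if inverse else scaled


df = pd.DataFrame({
    "requestid": [1, 1, 1, 2, 2],
    "totalprice": [100.0, 150.0, 200.0, 80.0, 120.0],
})
# Proportional: the most expensive offer in each request gets 1.
df["totalprice_norm"] = minmax_by_group(df, "totalprice", ["requestid"])
# Inverse: the cheapest offer in each request gets 1.
df["totalprice_score"] = minmax_by_group(df, "totalprice", ["requestid"], inverse=True)
print(df)
```

Grouping first means each price is compared only with the other offers in the same request, which is presumably what the keyval parameter controls.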

encode_totalprice:

def encode_totalprice(df):
    '''
        Total price encoder.
        inputs:
            df: Dataframe which has totalprice and totaltaxes columns (pandas df series)
        return:
            df: Dataframe with a score for totalprice relative to whole requestid-direction, by index, by faregroup  (pandas df series)
    '''

encode_bookingid:

def encode_bookingid(df):
    '''
        Bookingid flag one-hot encoder.
        inputs:
            df: Dataframe which has a Bookingid column (pandas df series)
        return:
            df: Dataframe which has only the Bookingid indicator [0/1] (pandas df series)
    '''

encode_faregroup:

def encode_faregroup(raw_file, df, verbose=False):
    '''
        Faregroup encoder.
        inputs:
            raw_file: location of the raw file where the faregroup_definition is available
            df: Dataframe which has a faregroup column (pandas df series)
            verbose: Optional, set if a more detailed log is needed
        return:
            field: Dataframe with extra fields holding faregroup attributes [-1/0/1] (pandas df series)
            Note: -1 = not available, 0 = available for a fee, 1 = available for free (at no charge)
    '''

encode_datetime:

def encode_datetime(df, field, verbose=False):
    '''
        Datetime encoder.
        inputs:
            df: Dataframe which has the 'field' column (pandas df series)
            field: the field that has to be encoded
        return:
            df: Dataframe with 6 encoded values capturing TOD, DOW and WOY
    '''
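
The docstring does not say how time of day (TOD), day of week (DOW) and week of year (WOY) become six values. One plausible reading, and purely an assumption here, is a sine/cosine pair for each of the three quantities, so that values at the end of a cycle sit close to values at the start. The helpers cyclic_pair and encode_timestamp below are hypothetical names used only to illustrate that idea.

```python
import math
from datetime import datetime


def cyclic_pair(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encode_timestamp(ts):
    """Return six illustrative features: sin/cos of TOD, DOW and WOY."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second  # time of day
    tod = cyclic_pair(seconds, 24 * 3600)
    dow = cyclic_pair(ts.weekday(), 7)          # Monday = 0
    woy = cyclic_pair(ts.isocalendar()[1], 53)  # ISO week number, 1..53
    return {
        "tod_sin": tod[0], "tod_cos": tod[1],
        "dow_sin": dow[0], "dow_cos": dow[1],
        "woy_sin": woy[0], "woy_cos": woy[1],
    }


features = encode_timestamp(datetime(2021, 1, 4, 6, 0, 0))  # a Monday at 06:00
print(features)
```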

encode_advanced_purchase:

def encode_advanced_purchase(df, dptr_field='departuredate', sales_field='utctimestamp'):
    '''
        Advanced purchase encoder.
        inputs:
            df: Dataframe which has the dptr_field and sales_field columns (pandas df series)
        return:
            df: Dataframe with an advanced purchase column
    '''
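
Advance purchase is commonly understood as the time between the sale and the departure. The sketch below assumes it is measured in whole days, which the docstring does not confirm; advance_purchase_days is a hypothetical helper, not a function of this package.

```python
from datetime import datetime


def advance_purchase_days(departure, sold):
    """Whole days between the sale timestamp and the departure date."""
    return (departure - sold).days


days = advance_purchase_days(
    departure=datetime(2020, 3, 15),
    sold=datetime(2020, 3, 1, 12, 0),
)
print(days)
```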

encode_los:

def encode_los(df, verbose=False):
    '''
        los and trip_type encoder.
        inputs:
            df: Dataframe which has departuredate, requestid and direction columns (pandas df series)
        return:
            df: Dataframe with encoded los and trip_type columns
    '''

encode_airports:

def encode_airports(df, fields, reference_file_path, verbose=True):
    '''
        Encode the airports.
        inputs:
            df                      : Dataframe which has airports (pandas df series)
            fields                  : list of fields that you want to run this
                                        function on (list)
            reference_file_path     : path to the codes file (string)
            verbose                 : Indicator for progress bar (bool)
        return:
            df: Dataframe encoded with airport values
    '''

encode_city:

def encode_city(df, fields, reference_file_path, verbose=True):
    '''
        Encode the cities.
        inputs:
            df                      : Dataframe which has cities (pandas df series)
            fields                  : list of fields that you want to run this
                                        function on (list)
            reference_file_path     : path to the codes file (string)
            verbose                 : Indicator for progress bar (bool)
        return:
            df: Dataframe encoded with city values
    '''

utils

This subpackage contains tools at a lower level than the encoder module, i.e. the modules that the encoder uses.

encoder_tools

The functions below can be accessed by importing from this module with from deepair_encoder.utils.encoder_tools import <function_name>.

one_hot_encoder:

def one_hot_encoder(df, fields, fields_drop=True, verbose=True, classes=None):
    '''
        Converts the fields into one-hot encoding.
        inputs:
            df: Dataframe containing all those fields (pandas df)
            fields: Dataframe columns that you want to convert to one-hot (string)
            verbose: Indicator for progress bar (bool)
            classes: Dictionary for the desired columns (dict)
        return:
            df: updated dataframe (pandas df)
    '''
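
To illustrate what one-hot encoding with a fixed set of classes looks like, here is a minimal pandas sketch. It is not the package's code: the helper one_hot and the cabin column are hypothetical, and the handling of classes (one column per listed class, even when that class is absent from the data) is an assumption about what the classes argument is for.

```python
import pandas as pd


def one_hot(df, field, classes=None, drop=True):
    """Append 0/1 indicator columns for df[field]; optionally drop the source column."""
    values = df[field]
    if classes is not None:
        # Fix the column set up front so absent classes still get a column.
        values = pd.Series(
            pd.Categorical(values, categories=classes), index=df.index
        )
    dummies = pd.get_dummies(values, prefix=field, dtype=int)
    out = pd.concat([df, dummies], axis=1)
    return out.drop(columns=[field]) if drop else out


df = pd.DataFrame({"cabin": ["Y", "J", "Y"]})
encoded = one_hot(df, "cabin", classes=["Y", "J", "F"])
print(encoded)
```

Fixing the classes keeps the output columns identical across datasets, which matters when the encoded frame feeds a model with a fixed input width.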

integer_encoder:

def integer_encoder(df, fields, verbose=True):
    '''
        Converts the fields into integer encoding.
        inputs:
            df: Dataframe containing all those fields (pandas df)
            fields: Dataframe columns that you want to convert to integer encoding (string)
            verbose: Indicator for progress bar (bool)
        returns:
            df: updated dataframe (pandas df)
    '''
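
Integer encoding replaces each distinct value with an integer code. The sketch below shows the idea on a plain list; integer_encode is a hypothetical helper, and assigning codes in sorted order of the values is an assumption, since the package may number them differently.

```python
def integer_encode(values):
    """Map each distinct value to an integer code, in sorted order of the values."""
    mapping = {value: code for code, value in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


codes, mapping = integer_encode(["LHR", "JFK", "LHR", "CDG"])
print(codes, mapping)
```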

get_wom:

def get_wom(field):
    '''
        Function for week of the month.
        inputs:
            field: Dataframe column which has timestamps (pandas df series)
        return:
            field: Dataframe column which has wom (pandas df series)
    '''

get_dow:

def get_dow(field):
    '''
        Function for day of the week.
        inputs:
            field: Dataframe column which has timestamps (pandas df series)
        return:
            field: Dataframe column which has dow (pandas df series)
    '''

get_month:

def get_month(field):
    '''
        Function for month.
        inputs:
            field: Dataframe column which has timestamps (pandas df series)
        return:
            field: Dataframe column which has only the month (pandas df series)
    '''

get_year:

def get_year(field):
    '''
        Function for year.
        inputs:
            field: Dataframe column which has timestamps (pandas df series)
        return:
            field: Dataframe column which has only the year (pandas df series)
    '''
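
The four calendar helpers above each extract one component of a timestamp. The sketch below shows the same components for a single datetime using the standard library. Week of the month has several competing definitions; this one counts days 1-7 as week 1, 8-14 as week 2, and so on, which is an assumption rather than the package's documented behaviour. The names week_of_month and calendar_parts are hypothetical.

```python
from datetime import datetime


def week_of_month(ts):
    """Week of the month, counting days 1-7 as week 1, 8-14 as week 2, and so on."""
    return (ts.day - 1) // 7 + 1


def calendar_parts(ts):
    """Collect the four calendar components for one timestamp."""
    return {
        "wom": week_of_month(ts),
        "dow": ts.weekday(),  # Monday = 0
        "month": ts.month,
        "year": ts.year,
    }


parts = calendar_parts(datetime(2021, 3, 17, 9, 30))
print(parts)
```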

obj2num:

def obj2num(df, fields, verbose=True):
    '''
        Converts the fields from object data type to numeric.
        inputs:
            df: Dataframe containing all those fields (pandas df)
            fields: Dataframe columns that you want to convert to numeric (string)
            verbose: Indicator for progress bar (bool)
        returns:
            df: updated dataframe (pandas df)
    '''

logger

The functions below can be accessed by importing from this module with from deepair_encoder.utils.logger import <function_name>.

unique_stats:

def unique_stats(df):

_log_with_timestamp:

def _log_with_timestamp(message):
    '''
        Prints the message on the console.
        input :
            message     : msg to print (string)
    '''
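
A timestamped console logger of this kind can be sketched in a few lines. The timestamp format and the returned string below are assumptions made for illustration; the package's own function is documented only as printing the message.

```python
from datetime import datetime


def log_with_timestamp(message):
    """Print message prefixed with the current local time and return the printed line."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    line = f"[{stamp}] {message}"
    print(line)
    return line


line = log_with_timestamp("encoding started")
```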

splitters

The functions below can be accessed by importing from this module with from deepair_encoder.utils.splitters import <function_name>.

type_splitter:

def type_splitter(data, keys=['passengertypes'], newheadings=[['ADT', 'CHD', 'INF']]):
    '''
        Split the column (needs to be made more generic).
        inputs:
            data: dataframe which contains this column (pd dataframe)
            keys: list of column fields to split (list)
            newheadings: list of lists of header columns, one per entry in keys (list)
        returns:
            data: updated dataframe (pandas df)
    '''

split_datewise:

def split_datewise(df, directory='', postfix='', mode='w'):
    '''
        Split a Dataframe according to the dates in UTC-Timestamp.

        input :
            df          : dataframe which contains this column (pd dataframe)
            directory   : path to the target directory (string)
            postfix     : any postfix you want to add (string)
            mode        : writing mode ('w': write, 'a': append)
    '''
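
Splitting by date amounts to grouping rows on the calendar date of their timestamp and writing one file per group. The sketch below does this for a list of dict rows with the standard library. Everything specific in it is an assumption: the helper name split_rows_by_date, the utctimestamp field name, the CSV output, and the <date><postfix>.csv file naming are illustrative and may differ from what split_datewise does.

```python
import csv
import os
from collections import defaultdict
from datetime import datetime


def split_rows_by_date(rows, directory, postfix="", mode="w", ts_field="utctimestamp"):
    """Write one CSV per calendar date found in rows[ts_field]; return the file paths."""
    by_date = defaultdict(list)
    for row in rows:
        day = datetime.fromisoformat(row[ts_field]).date().isoformat()
        by_date[day].append(row)

    paths = []
    for day, group in sorted(by_date.items()):
        path = os.path.join(directory, f"{day}{postfix}.csv")
        # When appending to an existing file, do not repeat the header row.
        write_header = mode == "w" or not os.path.exists(path)
        with open(path, mode, newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(group[0].keys()))
            if write_header:
                writer.writeheader()
            writer.writerows(group)
        paths.append(path)
    return paths
```

Calling it with mode="a" on a later batch extends the per-date files rather than overwriting them, which matches the 'w'/'a' choice described in the docstring.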


deepair-encoder 0.0.9
