Error when creating a function with dfply's @dfpipe

Posted 2024-04-30 00:38:45


I have a dataset "banks". If I group by the column "job" to check the count in each category, I get the following:

[Table of counts per job category; not preserved in the source]

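I have also added the first 3 rows of the dataset I am using:

age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,y
30,unemployed,married,primary,no,1787,no,no,cellular,19,oct,79,1,-1,0,unknown,no
33,services,married,secondary,no,4789,yes,yes,cellular,11,may,220,1,339,4,failure,no
35,management,single,tertiary,no,1350,yes,no,cellular,16,apr,185,1,330,1,failure,no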

My intention is to create a small function that I can reuse for other columns, so I tried to write one with the "dfply" package:

import pandas as pd
import dfply
from dfply import *

#creating the function

@dfpipe
def woe_iv(df,variable):
    step1=df>>group_by(X.variable)>>summarize(COUNT=X.variable.count())
    return step1

#invoking the function

banks>>woe_iv(X.job)

However, this code gives me the following error:

@dfpipe

def woe_iv(df,variable):
            
            step1=df>>group_by(X.variable)>>summarize(COUNT=X.variable.count())
            return step1
banks>>woe_iv(X.job)
Traceback (most recent call last):

  File "<ipython-input-46-d851aeac1927>", line 7, in <module>
    banks>>woe_iv(X.job)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 142, in __rrshift__
    result = self.function(other_copy)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 149, in <lambda>
    return pipe(lambda x: self.function(x, *args, **kwargs))

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 329, in __call__
    return self.function(*args, **kwargs)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 282, in __call__
    return self.function(df, *args, **kwargs)

  File "<ipython-input-46-d851aeac1927>", line 5, in woe_iv
    step1=df>>group_by(X.variable)>>summarize(COUNT=X.variable.count())

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 142, in __rrshift__
    result = self.function(other_copy)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 149, in <lambda>
    return pipe(lambda x: self.function(x, *args, **kwargs))

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 279, in __call__
    args = self._recursive_arg_eval(df, args[1:])

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 241, in _recursive_arg_eval
    return [

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 242, in <listcomp>
    self._symbolic_to_label(df, a) if i in eval_as_label

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 231, in _symbolic_to_label
    return self._evaluator_loop(df, arg, self._evaluate_label)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 225, in _evaluator_loop
    return eval_func(df, arg)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 181, in _evaluate_label
    arg = self._evaluate(df, arg)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 175, in _evaluate
    arg = arg.evaluate(df)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 71, in evaluate
    return self.function(context)

  File "/opt/anaconda3/lib/python3.8/site-packages/dfply/base.py", line 74, in <lambda>
    return Intention(lambda x: getattr(self.function(x), attribute),

  File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 5139, in __getattr__
    return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'variable'

Please let me know if I am missing something.
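The last line of the traceback suggests that `X.variable` is being resolved as a column literally named `variable`, not as the column passed in through the `variable` parameter. The sketch below reproduces that name-lookup behavior in plain Python, without dfply. The `Frame` class and the helpers `by_attribute` and `by_name` are hypothetical stand-ins invented for illustration, not part of dfply or pandas.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame: columns exposed as attributes."""

    def __init__(self, **columns):
        self.__dict__.update(columns)


banks = Frame(job=["management", "services"])


def by_attribute(df, variable):
    # `df.variable` always looks for an attribute literally named "variable";
    # the function parameter of the same name is never consulted.
    return df.variable


def by_name(df, variable):
    # getattr() looks up the attribute whose name is the *value* of `variable`.
    return getattr(df, variable)


try:
    by_attribute(banks, "job")
    error = None
except AttributeError as exc:
    # Same kind of failure as in the traceback: no attribute 'variable'.
    error = str(exc)

jobs = by_name(banks, "job")
print(error)
print(jobs)
```

Dot syntax fixes the attribute name when the code is written, so a reusable helper has to receive the column name as data (for example a string) and look it up dynamically.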


1 answer
User
#1 · Posted 2024-04-30 00:38:45

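Shameek Mukherjee, is this the correct interpretation and indentation of your sample code? Apart from the indentation, I cannot find any difference.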

import dfply
from dfply import *

@dfpipe
def woe_iv(df,variable):
    step1 = df>>group_by(X.variable)>>summarize(COUNT=X.variable.count())
    return step1

banks>>woe_iv(X.job)

Second example:

@dfpipe
def woe_iv(df,variable):
    step1 = df>>group_by(X.variable)>>summarize(COUNT=X.variable.count())
    return step1

banks>>woe_iv(X.job)
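One possible workaround for the stated goal (a counting helper that works for any column) is to pass the column name as a string and use plain pandas instead of dfply. This is a minimal sketch, not the asker's method and not a dfply fix: the helper name `count_by` and the tiny `banks` frame are assumptions made up for illustration.

```python
import pandas as pd


def count_by(df, column):
    """Return the number of rows per category of `column` in a COUNT column."""
    # groupby(...).size() counts the rows in each group; reset_index(name=...)
    # turns the resulting Series back into a two-column DataFrame.
    return df.groupby(column).size().reset_index(name="COUNT")


# Illustrative data only; the job values echo the sample rows in the question.
banks = pd.DataFrame(
    {
        "job": ["unemployed", "services", "management", "services"],
        "marital": ["married", "married", "single", "married"],
    }
)

result = count_by(banks, "job")
print(result)
```

Because the column is an ordinary string argument, the same helper can be called as `count_by(banks, "marital")` or with any other column name.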
