Trying to apply a function to a DataFrame to compute a score?

Posted 2024-04-29 14:13:32


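I created the user-defined function below and tried to apply it to a DataFrame, but I get a TypeError saying that scoreq() is missing 3 required positional arguments ('ADVTG_TRGT_INC', 'AGECD', and 'PPXPRI'), occurring at index ADVNTG_MARITAL_STAT.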

def scoreq(PCT_NO_OPEN_TRDLN, ADVTG_TRGT_INC, AGECD, PPXPRI):
        scoreq += -0.3657
        scoreq += (ADVNTG_MARITAL_STAT in ('2'))*-0.039
        scoreq += (ADVTG_TRGT_INC in ('7','6','5','4'))*0.1311
        scoreq += (AGECD in ('7','2'))*-0.1254
        scoreq += (PPXPRI in (-1))*-0.1786
        return scoreq
        
df_3Var['scoreq'] = df_3Var.apply(scoreq)

"TypeError: ("scoreq() missing 3 required positional arguments: 'ADVTG_TRGT_INC', 'AGECD', and 'PPXPRI'", 'occurred at index ADVNTG_MARITAL_STAT')"
 


df_3Var:- 
    ADVNTG_MARITAL_STAT   ADVTG_TRGT_INC    AGECD   PPXPRI
0                     1                5        6       -1
1                     2                6        5       -1
2                     1                2        2       -1
3                     2                7        6      133
4                     2                1        3       75

2 Answers

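You used column names as the parameters of scoreq, but apply does not work that way. It does not match columns to parameter names; the function simply receives one regular argument.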
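To see why the original call fails, it can help to look at what apply actually hands to the function. The following is a minimal sketch, assuming a two-row frame with the question's column names; the `describe` helper is hypothetical and exists only for illustration.

```python
import pandas as pd

# Small sample frame using the column names from the question.
df = pd.DataFrame({
    "ADVNTG_MARITAL_STAT": [1, 2],
    "ADVTG_TRGT_INC": [5, 6],
    "AGECD": [6, 5],
    "PPXPRI": [-1, -1],
})

def describe(arg):
    # Hypothetical helper: report the type and labels of whatever apply passes in.
    return f"{type(arg).__name__} with labels {list(arg.index)}"

# Default axis=0: called once per column, with that column as a single Series.
per_column = df.apply(describe)

# axis=1: called once per row, with a Series indexed by column name.
per_row = df.apply(describe, axis=1)

print(per_column["AGECD"])
print(per_row.loc[0])
```

With the default axis the first call receives the ADVNTG_MARITAL_STAT column as its only argument, which matches the "occurred at index ADVNTG_MARITAL_STAT" part of the error message.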

You have two options: pass the whole row to scoreq, or pass only the relevant values:

def scoreq(row):
        scoreq = row["...."]
        ...
        return scoreq

df_3Var['scoreq'] = df_3Var.apply(scoreq, axis=1)  # axis=1 passes rows, not columns

Or pass only the values directly:

df_3Var['scoreq'] = df_3Var.apply(lambda row: scoreq(row["..."], row["..."]), axis=1)
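As an illustration of the second option, here is a sketch that keeps a four-argument scoreq and unpacks the columns inside the lambda. It assumes the five sample rows from the question, renames the first parameter to ADVNTG_MARITAL_STAT so it matches the function body, and compares against integers because the sample columns look numeric; none of this is confirmed by the original poster.

```python
import pandas as pd

# Sample data copied from the question.
df_3Var = pd.DataFrame({
    "ADVNTG_MARITAL_STAT": [1, 2, 1, 2, 2],
    "ADVTG_TRGT_INC": [5, 6, 2, 7, 1],
    "AGECD": [6, 5, 2, 6, 3],
    "PPXPRI": [-1, -1, -1, 133, 75],
})

def scoreq(ADVNTG_MARITAL_STAT, ADVTG_TRGT_INC, AGECD, PPXPRI):
    # Assumption: first parameter renamed to match the name used in the body.
    score = -0.3657  # intercept
    score += (ADVNTG_MARITAL_STAT == 2) * -0.039
    score += (ADVTG_TRGT_INC in (7, 6, 5, 4)) * 0.1311
    score += (AGECD in (7, 2)) * -0.1254
    score += (PPXPRI == -1) * -0.1786
    return score

# axis=1 makes apply call the lambda once per row.
df_3Var["scoreq"] = df_3Var.apply(
    lambda row: scoreq(
        row["ADVNTG_MARITAL_STAT"],
        row["ADVTG_TRGT_INC"],
        row["AGECD"],
        row["PPXPRI"],
    ),
    axis=1,
)

print(df_3Var)
```

Keeping scalar parameters makes scoreq easy to test on its own, at the cost of a longer apply call.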

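Also, you probably want to treat the values in scoreq as numbers rather than strings, and compare a single value with == rather than in, e.g. scoreq += (row["PPXPRI"] == -1) * -0.1786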
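The reason the "in" tests in the question misbehave is that parentheses alone do not create a tuple. A short plain-Python check, independent of pandas (variable names here are illustrative only):

```python
# Parentheses without a trailing comma do not create a tuple.
single_str = ('2')      # just the string '2'
single_tuple = ('2',)   # a real one-element tuple

print(type(single_str).__name__)    # str
print(type(single_tuple).__name__)  # tuple

# With an int on the right-hand side, `in` raises TypeError
# because an int is not iterable.
value = -1
container = (-1)        # just the integer -1
try:
    value in container
    raised = False
except TypeError:
    raised = True
print(raised)

# A string code never matches an integer value.
string_codes = ('7', '2')
print(2 in string_codes)
```

So even with the apply call fixed, the original conditions would either raise or never match integer columns.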

The function called by "apply" should accept a row or a column. Here is a working implementation.

Note that you should also:

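  • Initialize scoreq before adding to it
  • Treat the values as numbers rather than strings
  • Use "in" with a list rather than a parenthesized single value: ('2') is just the string '2' and (-1) is just the integer -1, not tuples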
    def scoreq(row):
        scoreq = 0 # you need to initialize this variable. 
        scoreq += -0.3657
        scoreq += (row["ADVNTG_MARITAL_STAT"] == 2)*-0.039
        scoreq += (row["ADVTG_TRGT_INC"] in [7,6,5,4])*0.1311
        scoreq += (row["AGECD"] in [7,2])*-0.1254
        scoreq += (row["PPXPRI"] == -1)*-0.1786
        return scoreq
            
    df_3Var['scoreq'] = df_3Var.apply(scoreq, axis=1)
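On larger frames a row-wise apply can be slow. As an alternative technique (not what either answer proposes), the same score could be computed column-wise with Series.eq and Series.isin. This sketch assumes the sample rows from the question and integer-valued columns.

```python
import pandas as pd

# Sample data copied from the question.
df_3Var = pd.DataFrame({
    "ADVNTG_MARITAL_STAT": [1, 2, 1, 2, 2],
    "ADVTG_TRGT_INC": [5, 6, 2, 7, 1],
    "AGECD": [6, 5, 2, 6, 3],
    "PPXPRI": [-1, -1, -1, 133, 75],
})

# Each boolean Series acts as 0/1 when multiplied by its coefficient.
df_3Var["scoreq"] = (
    -0.3657
    + df_3Var["ADVNTG_MARITAL_STAT"].eq(2) * -0.039
    + df_3Var["ADVTG_TRGT_INC"].isin([7, 6, 5, 4]) * 0.1311
    + df_3Var["AGECD"].isin([7, 2]) * -0.1254
    + df_3Var["PPXPRI"].eq(-1) * -0.1786
)

print(df_3Var)
```

This operates on whole columns at once, so no Python function is called per row.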
