Python statsmodels: Probit and Logit raise errors, while OLS works fine?

Posted 2024-04-29 20:09:59

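My dependent variable consists entirely of 0s and 1s. It is stored in the train_label DataFrame, in the column named "Offer Accepted".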

train_label.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14400 entries, 0 to 14399
Data columns (total 1 columns):
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   Offer Accepted  14400 non-null  int64
dtypes: int64(1)
memory usage: 112.6 KB

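My independent variables are in the train_features DataFrame. Some are categorical (0 or 1; 1, 2, or 3) and some are ordinary numeric values.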

train_features.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14400 entries, 0 to 14399
Data columns (total 28 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   Customer Number             14400 non-null  float64
 1   Reward                      14400 non-null  float64
 2   Mailer Type                 14400 non-null  float64
 3   Income Level                14400 non-null  float64
 4   # Bank Accounts Open        14400 non-null  float64
 5   Overdraft Protection        14400 non-null  float64
 6   Credit Rating               14400 non-null  float64
 7   # Credit Cards Held         14400 non-null  float64
 8   # Homes Owned               14400 non-null  float64
 9   Household Size              14400 non-null  float64
 10  Own Your Home               14400 non-null  float64
 11  Average Balance             14400 non-null  float64
 12  Q1 Balance                  14400 non-null  float64
 13  Q2 Balance                  14400 non-null  float64
 14  Q3 Balance                  14400 non-null  float64
 15  Q4 Balance                  14400 non-null  float64
 16  Balance*Income Level        14400 non-null  float64
 17  Q1*Income Level             14400 non-null  float64
 18  Q2*Income Level             14400 non-null  float64
 19  Q3*Income Level             14400 non-null  float64
 20  Q4*Income Level             14400 non-null  float64
 21  Balance*Credit              14400 non-null  float64
 22  Q1*Credit                   14400 non-null  float64
 23  Q2*Credit                   14400 non-null  float64
 24  Q3*Credit                   14400 non-null  float64
 25  Q4*Credit                   14400 non-null  float64
 26  Bank Accounts*Income Level  14400 non-null  float64
 27  Bank Accounts*Credit        14400 non-null  float64
dtypes: float64(28)
memory usage: 3.1 MB

When I run:

from statsmodels.api import OLS
OLS(train_label, train_features).fit().summary()

it produces the expected summary table. My R-squared (uncentered) is low, at 0.093, so I would like to try a Probit or Logit model.

However, when I try to run

import statsmodels.api as sm
sm.Probit(train_label, train_features).fit().summary()

I get LinAlgError: Singular matrix.
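One possible explanation, offered as a guess rather than a diagnosis: a singular-matrix error like this typically appears when some columns are exact linear combinations of others (for instance, if Average Balance happens to be the mean of the four quarterly balances). OLS in statsmodels solves with a pseudoinverse by default, so it can still return a table for a rank-deficient design matrix, whereas Probit's default Newton fit has to invert the Hessian. The sketch below checks the rank of a small synthetic feature matrix; the data and the column names q1, q2, and avg are hypothetical stand-ins, not the real train_features.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for train_features (hypothetical data and column names).
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "q1": rng.normal(size=200),
    "q2": rng.normal(size=200),
})
# "avg" is an exact linear combination of q1 and q2, so X is rank-deficient.
X["avg"] = (X["q1"] + X["q2"]) / 2

rank = np.linalg.matrix_rank(X.to_numpy())
n_cols = X.shape[1]
is_rank_deficient = rank < n_cols

print(f"rank={rank}, columns={n_cols}, rank deficient: {is_rank_deficient}")
```

If the same check on the real features reports a rank below the number of columns, at least one column is redundant and is a candidate for removal before fitting.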

When I run

from statsmodels.api import OLS
OLS(train_label, train_features[2:3]).fit().summary()

I get ValueError: The indices for endog and exog are not aligned. I tried train_label.reindex(train_features.index), but that did not change anything.
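A likely reason for this second error: in pandas, a slice inside plain square brackets selects rows, not columns, so train_features[2:3] is a single row with all 28 columns, which cannot line up with the 14400 labels. Note also that reindex returns a new object rather than modifying the DataFrame in place. The sketch below shows the difference on a toy frame (the frame and its column names are made up for illustration).

```python
import pandas as pd

# Toy frame standing in for train_features (hypothetical values).
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "c": [9, 10, 11, 12],
    "d": [13, 14, 15, 16],
})

rows = df[2:3]          # slice in [] selects ROWS: one row, all columns
cols = df.iloc[:, 2:3]  # positional COLUMN selection: all rows, one column

print(rows.shape)
print(cols.shape)
```

Selecting a single feature for the regression would therefore be written with .iloc[:, 2:3] or by column name, so that the row index still matches the label frame.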

Does anyone know how I can run a probit regression?

