My dependent variable consists of 0s and 1s. It is stored in the train_label DataFrame, in the column named "Offer Accepted".
train_label.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14400 entries, 0 to 14399
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Offer Accepted 14400 non-null int64
dtypes: int64(1)
memory usage: 112.6 KB
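My independent variables are in the train_features DataFrame. Some are categorical (0 or 1; 1, 2, or 3) and some are regular numeric variables.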
train_features.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14400 entries, 0 to 14399
Data columns (total 28 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Customer Number 14400 non-null float64
1 Reward 14400 non-null float64
2 Mailer Type 14400 non-null float64
3 Income Level 14400 non-null float64
4 # Bank Accounts Open 14400 non-null float64
5 Overdraft Protection 14400 non-null float64
6 Credit Rating 14400 non-null float64
7 # Credit Cards Held 14400 non-null float64
8 # Homes Owned 14400 non-null float64
9 Household Size 14400 non-null float64
10 Own Your Home 14400 non-null float64
11 Average Balance 14400 non-null float64
12 Q1 Balance 14400 non-null float64
13 Q2 Balance 14400 non-null float64
14 Q3 Balance 14400 non-null float64
15 Q4 Balance 14400 non-null float64
16 Balance*Income Level 14400 non-null float64
17 Q1*Income Level 14400 non-null float64
18 Q2*Income Level 14400 non-null float64
19 Q3*Income Level 14400 non-null float64
20 Q4*Income Level 14400 non-null float64
21 Balance*Credit 14400 non-null float64
22 Q1*Credit 14400 non-null float64
23 Q2*Credit 14400 non-null float64
24 Q3*Credit 14400 non-null float64
25 Q4*Credit 14400 non-null float64
26 Bank Accounts*Income Level 14400 non-null float64
27 Bank Accounts*Credit 14400 non-null float64
dtypes: float64(28)
memory usage: 3.1 MB
When I run:
from statsmodels.api import OLS
OLS(train_label, train_features).fit().summary()
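it produces the expected summary table. My R-squared (uncentered) is low, at 0.093, so I would like to try a Probit or Logit model instead.
However, when I try to run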
import statsmodels.api as sm
sm.Probit(train_label, train_features).fit().summary()
I get LinAlgError: Singular matrix.
When I run
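One possible cause of a singular matrix in Probit is perfect collinearity among the feature columns; OLS in statsmodels uses a pseudo-inverse by default, which may be why it ran anyway. Whether that is what happens with this data cannot be told from the post, so the following is only a minimal sketch on synthetic data. The frame `features`, the column "Balance Copy", and the helper `redundant_columns` are hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for train_features (hypothetical data, not the poster's).
rng = np.random.default_rng(0)
features = pd.DataFrame({
    "Income Level": rng.integers(1, 4, size=200).astype(float),
    "Average Balance": rng.normal(1000.0, 300.0, size=200),
})
# A deliberately redundant column: an exact multiple of another column.
features["Balance Copy"] = 2.0 * features["Average Balance"]

# If the rank is smaller than the number of columns, the design matrix is
# rank deficient and the Hessian used by Probit cannot be inverted.
X = features.to_numpy()
rank = np.linalg.matrix_rank(X)
print(f"columns: {X.shape[1]}, rank: {rank}")


def redundant_columns(df):
    """Return the columns that do not raise the rank when added left to right."""
    kept, dropped = [], []
    for col in df.columns:
        candidate = df[kept + [col]].to_numpy()
        if np.linalg.matrix_rank(candidate) > len(kept):
            kept.append(col)
        else:
            dropped.append(col)
    return dropped


print(redundant_columns(features))
```

If the same check on the real train_features reports a rank below 28, dropping the listed columns before calling sm.Probit is one thing to try. Other causes, such as a predictor that perfectly separates the two classes, would not show up in this check.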
from statsmodels.api import OLS
OLS(train_label, train_features[2:3]).fit().summary()
I get ValueError: The indices for endog and exog are not aligned.
I tried this code,
train_label.reindex(train_features.index)
but it did not change anything.
Any idea how I can run a probit regression?
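A note on the alignment error: on a DataFrame, `train_features[2:3]` slices rows by position, not columns, so it yields a single row whose index cannot match the 14400-row label frame. Also, `reindex` returns a new object, so calling it without assigning the result leaves `train_label` unchanged. The sketch below illustrates both points on small synthetic frames; the values are made up, and only the column names come from the post.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the poster's frames (hypothetical values).
train_features = pd.DataFrame(
    np.arange(20, dtype=float).reshape(5, 4),
    columns=["Customer Number", "Reward", "Mailer Type", "Income Level"],
)
train_label = pd.DataFrame({"Offer Accepted": [0, 1, 0, 0, 1]})

# Plain slicing selects ROWS by position: one row, all four columns.
row_slice = train_features[2:3]
print(row_slice.shape)  # (1, 4)

# .iloc[:, 2:3] selects COLUMNS by position: all rows, one column.
col_slice = train_features.iloc[:, 2:3]
print(col_slice.shape)  # (5, 1)
print(list(col_slice.columns))  # ['Mailer Type']

# The column slice keeps the full index, so it lines up with the labels;
# the row slice does not.
print(col_slice.index.equals(train_label.index))  # True
print(row_slice.index.equals(train_label.index))  # False
```

With the real data, passing `train_features.iloc[:, 2:3]` (or `train_features[["Mailer Type"]]`) as exog should give statsmodels frames with matching indices.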