我正在运行以下源代码:
import statsmodels.formula.api as sm
# Add one column of ones for the intercept term
X = np.append(arr= np.ones((50, 1)).astype(int), values=X, axis=1)
regressor_OLS = sm.OLS(endog=y, exog=X).fit()
print(regressor_OLS.summary())
在哪里
X
是一个50x5(在添加截距项之前)numpy数组,如下所示:
并且y
是一个50x1numpy数组,其因变量具有浮点值。在
前两列用于具有三个不同值的伪变量。其余列是三个不同的独立变量。在
尽管,据说statsmodels.formula.api.OLS
会自动添加一个截距项(参见@stellacia的回答:OLS using statsmodel.formula.api versus statsmodel.api),但它的summary
没有显示截距项的统计值,这在我的例子中很明显:
OLS Regression Results
==============================================================================
Dep. Variable: Profit R-squared: 0.988
Model: OLS Adj. R-squared: 0.986
Method: Least Squares F-statistic: 727.1
Date: Sun, 01 Jul 2018 Prob (F-statistic): 7.87e-42
Time: 21:40:23 Log-Likelihood: -545.15
No. Observations: 50 AIC: 1100.
Df Residuals: 45 BIC: 1110.
Df Model: 5
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
x1 3464.4536 4905.406 0.706 0.484 -6415.541 1.33e+04
x2 5067.8937 4668.238 1.086 0.283 -4334.419 1.45e+04
x3 0.7182 0.066 10.916 0.000 0.586 0.851
x4 0.3113 0.035 8.885 0.000 0.241 0.382
x5 0.0786 0.023 3.429 0.001 0.032 0.125
==============================================================================
Omnibus: 1.355 Durbin-Watson: 1.288
Prob(Omnibus): 0.508 Jarque-Bera (JB): 1.241
Skew: -0.237 Prob(JB): 0.538
Kurtosis: 2.391 Cond. No. 8.28e+05
==============================================================================
为此,我在源代码中添加了一行:
X = np.append(arr= np.ones((50, 1)).astype(int), values=X, axis=1)
正如您在我的文章开头看到的,截距/常数的统计值如下所示:
OLS Regression Results
==============================================================================
Dep. Variable: Profit R-squared: 0.951
Model: OLS Adj. R-squared: 0.945
Method: Least Squares F-statistic: 169.9
Date: Sun, 01 Jul 2018 Prob (F-statistic): 1.34e-27
Time: 20:25:21 Log-Likelihood: -525.38
No. Observations: 50 AIC: 1063.
Df Residuals: 44 BIC: 1074.
Df Model: 5
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 5.013e+04 6884.820 7.281 0.000 3.62e+04 6.4e+04
x1 198.7888 3371.007 0.059 0.953 -6595.030 6992.607
x2 -41.8870 3256.039 -0.013 0.990 -6604.003 6520.229
x3 0.8060 0.046 17.369 0.000 0.712 0.900
x4 -0.0270 0.052 -0.517 0.608 -0.132 0.078
x5 0.0270 0.017 1.574 0.123 -0.008 0.062
==============================================================================
Omnibus: 14.782 Durbin-Watson: 1.283
Prob(Omnibus): 0.001 Jarque-Bera (JB): 21.266
Skew: -0.948 Prob(JB): 2.41e-05
Kurtosis: 5.572 Cond. No. 1.45e+06
==============================================================================
为什么当我不给自己添加截距项时,截距的统计值没有显示出来,尽管据说{
“除非使用公式,否则模型不会添加常量。” 所以试试下面的例子。应根据数据集定义变量名。在
使用
而不是
^{pr2}$相关问题 更多 >
编程相关推荐