Converting a statsmodels summary object to a Pandas DataFrame

X_opt = X[:, [0,1,2,3]]
regressor_OLS = sm.OLS(endog= y, exog= X_opt).fit()
regressor_OLS.summary()

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.951
Model:                            OLS   Adj. R-squared:                  0.948
Method:                 Least Squares   F-statistic:                     296.0
Date:                Wed, 08 Aug 2018   Prob (F-statistic):           4.53e-30
Time:                        00:46:48   Log-Likelihood:                -525.39
No. Observations:                  50   AIC:                             1059.
Df Residuals:                      46   BIC:                             1066.
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const       5.012e+04   6572.353      7.626      0.000    3.69e+04    6.34e+04
x1             0.8057      0.045     17.846      0.000       0.715       0.897
x2            -0.0268      0.051     -0.526      0.602      -0.130       0.076
x3             0.0272      0.016      1.655      0.105      -0.006       0.060
==============================================================================
Omnibus:                       14.838   Durbin-Watson:                   1.282
Prob(Omnibus):                  0.001   Jarque-Bera (JB):               21.442
Skew:                          -0.949   Prob(JB):                     2.21e-05
Kurtosis:                       5.586   Cond. No.                     1.40e+06
==============================================================================

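3 Answers

User

#1 · Edited 2024-06-16 10:30:39

The answer from @Michael B works well, but it requires "recreating" the table. The table itself can be obtained directly from the summary().tables attribute. Each entry in this attribute (a list of tables) is a SimpleTable, which has methods for output in different formats. Any of those formats can then be read back into a pd.DataFrame: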

from io import StringIO

import pandas as pd
import statsmodels.api as sm

model = sm.OLS(y,x)
results = model.fit()
results_summary = results.summary()

# Note that tables is a list. The table at index 1 is the "core" table. Additionally, read_html puts dfs in a list, so we want index 0
results_as_html = results_summary.tables[1].as_html()
# Wrapping the string in StringIO avoids the deprecation warning that recent pandas versions emit for literal HTML input
pd.read_html(StringIO(results_as_html), header=0, index_col=0)[0]

User

#2 · Edited 2024-06-16 10:30:39

A simple solution takes just one line of code:

LRresult = (result.summary2().tables[1])

This gives you a DataFrame object:

type(LRresult)

pandas.core.frame.DataFrame

To get the significant variables and run the test again:

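import statsmodels.formula.api as smf

# Keep the terms with p <= 0.05. The [1:] drops the first entry, which
# assumes the intercept is significant and therefore listed first.
newlist = list(LRresult[LRresult['P>|z|']<=0.05].index)[1:]
myform1 = 'binary_Target' + ' ~ ' + ' + '.join(newlist)

# Refit the logit model using only the significant terms
M1_test2 = smf.logit(formula=myform1,data=myM1_1)

result2 = M1_test2.fit(maxiter=200)
LRresult2 = (result2.summary2().tables[1])
LRresult2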
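The selection step above relies on the intercept being the first significant row. A slightly more defensive variant drops the intercept by label instead of by position. This is only a sketch on synthetic data: the DataFrame data, the columns a, b, noise and target, and the 0.05 threshold are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: the target depends on a and b, but not on noise
rng = np.random.default_rng(42)
n = 500
data = pd.DataFrame({
    "a": rng.normal(size=n),
    "b": rng.normal(size=n),
    "noise": rng.normal(size=n),
})
logits = 2.0 * data["a"] - 2.0 * data["b"]
data["target"] = (rng.uniform(size=n) < 1 / (1 + np.exp(-logits))).astype(int)

result = smf.logit("target ~ a + b + noise", data=data).fit(disp=0)
coef_table = result.summary2().tables[1]  # already a pandas DataFrame

# Pick significant terms, then remove the intercept by name (if present)
mask = coef_table["P>|z|"] <= 0.05
significant = coef_table.loc[mask].index.drop("Intercept", errors="ignore")

# Rebuild the formula and refit with the significant terms only
formula = "target ~ " + " + ".join(significant)
result2 = smf.logit(formula, data=data).fit(disp=0)
print(result2.summary2().tables[1])
```

If no term passes the threshold, the formula would have an empty right-hand side, so real code would likely want to check that significant is non-empty before refitting.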

User

#3 · Edited 2024-06-16 10:30:39

Store the fitted model in a variable named results, like so:

import statsmodels.api as sm
model = sm.OLS(y,x)
results = model.fit()

Then create a function like the following:

import pandas as pd

def results_summary_to_dataframe(results):
    '''Take a fitted statsmodels results object and return its key statistics as a DataFrame.'''
    pvals = results.pvalues
    coeff = results.params
    # conf_int() returns a DataFrame for pandas inputs and an ndarray for NumPy
    # inputs; wrapping it makes column 0 the lower and column 1 the upper bound
    conf = pd.DataFrame(results.conf_int())
    conf_lower = conf[0]
    conf_higher = conf[1]

    results_df = pd.DataFrame({"pvals":pvals,
                               "coeff":coeff,
                               "conf_lower":conf_lower,
                               "conf_higher":conf_higher
                                })

    # Reorder the columns so the coefficient comes first
    results_df = results_df[["coeff","pvals","conf_lower","conf_higher"]]
    return results_df

You can use dir() to list all attributes of the results object, and then add whichever ones you need to the function and the DataFrame.
