Gamma regression with intercept only

Posted 2024-04-20 10:55:02


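I'm new to Python and I'm trying to fit a gamma regression. I expected to get estimates similar to R's, but I don't understand the Python syntax and my code produces an error. Any ideas on how to fix it?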

My R code:

set.seed(1)
y = rgamma(18,10,.1)
print(y)
[1]  76.67251 140.40808 138.26660 108.20993  53.46417 110.61754 119.11950 113.57558  85.82045  71.96892
[11]  76.81693  86.00139  93.62010  69.49795 121.99775 114.18707 125.43608 120.63640

# Option 1
model = glm(y~1,family=Gamma)
summary(model)

# Option 2
# x = rep(1,18)
# summary(glm(y~x,family=Gamma))

Output:

^{pr2}$

Python code:

y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
 119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
 93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]

x = np.repeat(1,18)

import numpy
import statsmodels.api as sm

model = sm.GLM(x,y, family=sm.families.Gamma()).fit()
print(model.summary())

I expect output similar to R's.


2 Answers

Here is another approach that uses the formula interface; for this you need to import statsmodels.formula.api:

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
 119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
 93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]

df = pd.DataFrame({'y':y})

model = smf.glm(formula = 'y ~ 1', data = df, family=sm.families.Gamma()).fit()
model.summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                      y   No. Observations:                   18
Model:                            GLM   Df Residuals:                       17
Model Family:                   Gamma   Df Model:                            0
Link Function:          inverse_power   Scale:                        0.062556
Method:                          IRLS   Log-Likelihood:                -83.656
Date:                Sun, 20 May 2018   Deviance:                       1.1761
Time:                        22:00:54   Pearson chi2:                     1.06
No. Iterations:                     6   Covariance Type:             nonrobust
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
                                       
Intercept      0.0099      0.001     16.963      0.000       0.009       0.011
==============================================================================
"""
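The numbers in this summary can be sanity-checked by hand. The sketch below is not how statsmodels computes the fit (it uses IRLS); it relies on two properties of this special case: in an intercept-only GLM the fitted mean is the sample mean, and with the inverse link the coefficient is therefore 1/mean(y). The standard-error formula is derived for this intercept-only case only, and it assumes the scale is estimated as Pearson chi-square divided by the residual degrees of freedom, which is consistent with the Scale and Pearson chi2 values shown in the summary.

```python
import math

y = [76.67251, 140.40808, 138.26660, 108.20993, 53.46417, 110.61754,
     119.11950, 113.57558, 85.82045, 71.96892, 76.81693, 86.00139,
     93.62010, 69.49795, 121.99775, 114.18707, 125.43608, 120.63640]

n = len(y)
mu = sum(y) / n                 # fitted mean of an intercept-only model
coef = 1.0 / mu                 # intercept on the inverse-link scale

# Pearson chi-square for the Gamma family, whose variance function is V(mu) = mu^2
pearson_chi2 = sum((yi - mu) ** 2 for yi in y) / mu ** 2
scale = pearson_chi2 / (n - 1)  # assumed estimator: Pearson chi2 / residual df

# Intercept-only special case: Var(coef) = scale / (n * mu^2)
std_err = math.sqrt(scale / n) / mu
z = coef / std_err

print(f"coef={coef:.4f}  scale={scale:.6f}  std_err={std_err:.4f}  z={z:.3f}")
```

Up to rounding, these values should agree with the coef, Scale, std err and z entries in the summary. Both R's Gamma family and sm.families.Gamma() default to the inverse link, so the coefficient is on the same scale in the two programs.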

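You need to swap the order of the x and y variables in your Python code: sm.GLM expects the response first and the design matrix second. You will then see exactly the same results (although the number of significant digits shown differs from the R output):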

 sm.GLM(y,x, family=sm.families.Gamma()).fit().summary()

<class 'statsmodels.iolib.summary.Summary'>
"""
                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                      y   No. Observations:                   18
Model:                            GLM   Df Residuals:                       17
Model Family:                   Gamma   Df Model:                            0
Link Function:          inverse_power   Scale:                 0.0625558699706
Method:                          IRLS   Log-Likelihood:                -83.656
Date:                Sun, 20 May 2018   Deviance:                       1.1761
Time:                        17:59:04   Pearson chi2:                     1.06
No. Iterations:                     4
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
                                       
const          0.0099      0.001     16.963      0.000       0.009       0.011
==============================================================================
"""

Each Python package has its own syntax. Here is a good link with some examples of how to use formula syntax in Python: http://www.statsmodels.org/dev/example_formulas.html
