Linear regression on a pandas DataFrame fails to fit

Posted 2024-04-27 04:53:01


I am trying to run a regression on some data from a DataFrame, but I keep getting this strange shape error. Any idea what is wrong?

import pandas as pd
import io
import requests
import statsmodels.api as sm

# Read in a dataset 
url="https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/arabica_data_cleaned.csv"
s=requests.get(url).content
df=pd.read_csv(io.StringIO(s.decode('utf-8')))

# Select feature columns 
X = df[['Body', 'Clean.Cup']]

# Select dv column
y = df['Cupper.Points']

# make model
mod = sm.OLS(X, y).fit()

I get this error: shapes (1311,2) and (1311,2) not aligned: 2 (dim 1) != 1311 (dim 0)


2 answers

The order of y and X is wrong. sm.OLS expects the dependent variable first and the predictors second:

sm.OLS(y,X)

In the sm.OLS call, the X and y arguments are in the wrong order:

import pandas as pd
import io
import requests
import statsmodels.api as sm

# Read in a dataset 
url="https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/arabica_data_cleaned.csv"
s=requests.get(url).content
df=pd.read_csv(io.StringIO(s.decode('utf-8')))

# Select feature columns 
X = df[['Body', 'Clean.Cup']]

# Select dv column
y = df['Cupper.Points']

# make model
mod = sm.OLS(y, X).fit()

mod.summary()

This runs and returns:

<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          Cupper.Points   R-squared:                       0.998
Model:                            OLS   Adj. R-squared:                  0.998
Method:                 Least Squares   F-statistic:                 3.145e+05
Date:                Sat, 06 Jul 2019   Prob (F-statistic):               0.00
Time:                        19:42:59   Log-Likelihood:                -454.94
No. Observations:                1311   AIC:                             913.9
Df Residuals:                    1309   BIC:                             924.2
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Body           0.8464      0.016     53.188      0.000       0.815       0.878
Clean.Cup      0.1154      0.012      9.502      0.000       0.092       0.139
==============================================================================
Omnibus:                      537.879   Durbin-Watson:                   1.710
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            30220.027
Skew:                           1.094   Prob(JB):                         0.00
Kurtosis:                      26.419   Cond. No.                         26.2
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
"""
