I get an index error with a linear regression model

Posted 2024-04-28 19:50:52


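I am trying to run k-fold cross-validation on some data. The code works when the X variable is scaled, but fails when X is left unscaled. I would like to run it without scaling the data to see whether I get different or better results.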

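I tried commenting out the scaler and searched online for similar problems, but had no luck. If you think you know how to fix this, that would be great.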

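import numpy as np
import pandas
from sklearn import linear_model
from sklearn.model_selection import KFold
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

lm = linear_model.LinearRegression()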

dataset = pandas.read_csv('machine.csv')
X = dataset.iloc[:, [0]]
y = dataset.iloc[:, 1]

'''
scaler = MinMaxScaler(feature_range=(0, 1))
X = scaler.fit_transform(X)
print("new X=",X)
'''
scores = []
best_svr = SVR(kernel='linear')
cv = KFold(n_splits=10, shuffle=True)
for train_index, test_index in cv.split(X):
    print("Train Index: ", train_index, "\n")
    print("Test Index: ", test_index)

    X_train, X_test, y_train, y_test = X[train_index], X[test_index], y[train_index], y[test_index]
    # y = y_train.ravel()
    # y_train = np.array(y).astype(int)
    # Used this because the program expected a 1-D array; see
    # https://stackoverflow.com/questions/34165731/a-column-vector-y-was-passed-when-a-1d-array-was-expected


    lm.fit(X_train, y_train)    
    scores.append(lm.score(X_test, y_test))
print(np.mean(scores))

The error message I get is:

Traceback (most recent call last):
  File "D:\Kings\project\linear_regression.py", line 71, in <module>
    X_train, X_test, y_train, y_test = X[train_index], X[test_index], y[train_index], y[test_index]
  File "C:\Users\Alexis'\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 2934, in __getitem__
    raise_missing=True)
  File "C:\Users\Alexis'\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexing.py", line 1354, in _convert_to_indexer
    return self._get_listlike_indexer(obj, axis, **kwargs)[1]
  File "C:\Users\Alexis'\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexing.py", line 1161, in _get_listlike_indexer
    raise_missing=raise_missing)
  File "C:\Users\Alexis'\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexing.py", line 1246, in _validate_read_indexer
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Int64Index([  0,   1,   2,   5,   7,   8,   9,  10,  11,  13,\n            ...\n            241, 242, 243, 244, 245, 246, 247, 248, 249, 250],\n           dtype='int64', length=225)] are in the [columns]"
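
A likely reading of the traceback: with the scaler enabled, `fit_transform` returns a NumPy array, so `X[train_index]` picks rows by position. With the scaler commented out, `X` stays a pandas DataFrame, and `X[train_index]` is treated as a lookup of column labels, which is why the message ends with "are in the [columns]". The sketch below shows one way the loop might be written with `.iloc` so the unscaled DataFrame is indexed by row position. Because `machine.csv` is not available here, the data is a made-up stand-in (the column names `feature` and `target`, the row count, and the `random_state` values are assumptions for illustration only).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Hypothetical stand-in for machine.csv: one feature column, one target column.
rng = np.random.default_rng(0)
feature = rng.uniform(0, 100, size=250)
dataset = pd.DataFrame({
    "feature": feature,
    "target": 3.0 * feature + rng.normal(0, 1, size=250),
})

X = dataset.iloc[:, [0]]  # DataFrame (2-D), left unscaled
y = dataset.iloc[:, 1]    # Series (1-D)

lm = LinearRegression()
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for train_index, test_index in cv.split(X):
    # .iloc selects rows by position; plain X[train_index] on a DataFrame
    # would look the integers up as column labels and raise KeyError.
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    lm.fit(X_train, y_train)
    scores.append(lm.score(X_test, y_test))

mean_score = float(np.mean(scores))
print(mean_score)
```

An alternative that keeps the original indexing line unchanged would be to convert both objects up front, for example `X = dataset.iloc[:, [0]].to_numpy()` and `y = dataset.iloc[:, 1].to_numpy()`, since NumPy arrays accept integer index arrays directly.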
