Trying to implement logistic regression, but GridSearchCV reports inconsistent numbers of samples: [60000, 60001]

Posted 2024-04-19 23:41:08


Here is my code, running in a Python 3 environment:

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=DeprecationWarning)

tr_features = pd.read_csv('/home/pranjal/PycharmProjects/train_features.csv')
tr_labels = pd.read_csv('/home/pranjal/PycharmProjects/train_labels.csv', header=None)


        

lr = LogisticRegression()
parameters = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]
}

cv = GridSearchCV(lr, parameters, cv=5)
cv.fit(tr_features, tr_labels.values.ravel())

print_results(cv)  # print_results is a helper that is not shown in the question

Running it raises the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-20-c836a092d0ab> in <module>
      5 
      6 cv = GridSearchCV(lr, parameters, cv=5)
----> 7 cv.fit(tr_features, tr_labels.values.ravel())
      8 
      9 print_results(cv)

/home/pranjal/snap/jupyter/common/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     71                           FutureWarning)
     72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 73         return f(**kwargs)
     74     return inner_f
     75 

/home/pranjal/snap/jupyter/common/lib/python3.7/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    674             refit_metric = 'score'
    675 
--> 676         X, y, groups = indexable(X, y, groups)
    677         fit_params = _check_fit_params(X, fit_params)
    678 

/home/pranjal/snap/jupyter/common/lib/python3.7/site-packages/sklearn/utils/validation.py in indexable(*iterables)
    291     """
    292     result = [_make_indexable(X) for X in iterables]
--> 293     check_consistent_length(*result)
    294     return result
    295 

/home/pranjal/snap/jupyter/common/lib/python3.7/site-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
    255     if len(uniques) > 1:
    256         raise ValueError("Found input variables with inconsistent numbers of"
--> 257                          " samples: %r" % [int(l) for l in lengths])
    258 
    259 

ValueError: Found input variables with inconsistent numbers of samples: [60000, 60001]

Please help me debug this code.


Tags: csv, in, import, home, labels, jupyter, common, sklearn
1 answer
User
#1 · Posted 2024-04-19 23:41:08

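scikit-learn expects the features as a 2-D array of shape (n_samples, n_features), while the labels for a classifier can be 1-D with shape (n_samples,), which is what ravel() returns on the NumPy array. The error is about the first dimension: the two inputs have different numbers of rows (60000 versus 60001). Note that your code reads the labels with header=None but reads the features with the default header handling, so one likely cause is that a header row in train_labels.csv is being counted as a sample. Compare tr_features.shape and tr_labels.shape, and read both files with the header setting that matches their contents. Once the row counts agree, you can also try passing the same data type for both inputs to cv.fit(), i.e.

cv.fit(tr_features, tr_labels)

so that both the features and the labels are DataFrames.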
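To check whether the header setting explains the extra row, here is a minimal sketch. It assumes both CSV files start with a header row, which the question does not confirm; the in-memory strings and the names `bad_labels` and `good_labels` are hypothetical stand-ins for the real files.

```python
import io

import pandas as pd

# Hypothetical stand-ins for train_features.csv and train_labels.csv.
# Assumption: each file has one header row followed by three data rows.
features_csv = "f1,f2\n0.1,1.0\n0.2,2.0\n0.3,3.0\n"
labels_csv = "label\n0\n1\n0\n"

# Default header handling: the first line becomes the column names.
tr_features = pd.read_csv(io.StringIO(features_csv))

# header=None: the first line is kept as a data row, giving one extra sample.
bad_labels = pd.read_csv(io.StringIO(labels_csv), header=None)

# Same header handling as the features: the row counts now agree.
good_labels = pd.read_csv(io.StringIO(labels_csv))

print(len(tr_features), len(bad_labels), len(good_labels))  # 3 4 3

# ravel() flattens the single label column to shape (n_samples,).
y = good_labels.values.ravel()
assert tr_features.shape[0] == y.shape[0]
```

If the real labels file turns out to have no header row, the mismatch would instead come from the features file, and the same length comparison would show which side is off by one.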
