GridSearchCV takes more than 30 minutes. Is there any way to reduce this? (Jupyter)

from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn import preprocessing as pre

X_feature = X_feature.reshape(-1, 1)
y_label = y_label.reshape(-1, 1)

# Two sub-grids: 16 RBF candidates plus 16 polynomial candidates
param = [{'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5], 'C': [1, 10, 100, 1000]},
         {'kernel': ['poly'], 'C': [1, 10, 100, 1000], 'degree': [1, 2, 3, 4]}]

reg = SVR(C=1)
timeseries_split = TimeSeriesSplit(n_splits=3)
clf = GridSearchCV(reg, param, cv=timeseries_split, scoring='neg_mean_squared_error')

# Scale features and target to [0, 1]
X = pre.MinMaxScaler(feature_range=(0, 1)).fit(X_feature)
scaled_X = X.transform(X_feature)
y = pre.MinMaxScaler(feature_range=(0, 1)).fit(y_label)
scaled_y = y.transform(y_label)

clf.fit(scaled_X, scaled_y.ravel())  # SVR expects a 1-D target

2 Answers

User

#1 · Edited 2024-04-27 03:06:15

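Depending on the size of the data and the estimator, this can take a long time. Alternatively, you can try splitting the search into smaller parts and use only one kernel at a time, like this: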

param_rbf = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5],
                   'C': [1, 10, 100, 1000]}

Then use it like this:

clf = GridSearchCV(reg, param_rbf, cv=timeseries_split, scoring='neg_mean_squared_error')

Similarly, run a separate search for each of the other kernels with its own params dictionary:

params_poly = {'kernel': ['poly'], 'C': [1, 10, 100, 1000], 'degree': [1, 2, 3, 4]}
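The per-kernel approach could be sketched roughly as follows. This is only an illustration: X_demo and y_demo are synthetic stand-ins for the scaled data in the question, and the results dictionary and the timing are hypothetical additions that show which kernel's grid dominates the run time.

```python
import time

import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

# Synthetic stand-in data (assumption); replace with scaled_X / scaled_y.
rng = np.random.default_rng(0)
X_demo = rng.random((120, 1))
y_demo = np.sin(2 * np.pi * X_demo[:, 0]) + 0.1 * rng.standard_normal(120)

param_rbf = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5],
             'C': [1, 10, 100, 1000]}
params_poly = {'kernel': ['poly'], 'C': [1, 10, 100, 1000],
               'degree': [1, 2, 3, 4]}

cv = TimeSeriesSplit(n_splits=3)
results = {}  # kernel name -> (best score, best params, seconds taken)
for name, grid in [('rbf', param_rbf), ('poly', params_poly)]:
    start = time.perf_counter()
    search = GridSearchCV(SVR(), grid, cv=cv,
                          scoring='neg_mean_squared_error')
    search.fit(X_demo, y_demo)
    elapsed = time.perf_counter() - start
    results[name] = (search.best_score_, search.best_params_, elapsed)
    print(name, results[name])

# neg_mean_squared_error is "higher is better", so take the maximum.
best_kernel = max(results, key=lambda k: results[k][0])
print('best kernel:', best_kernel)
```

Running the kernels separately does not reduce the total work, but it does show which grid is slow, so that grid can be narrowed first.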

I know this is not exactly a solution, just a few suggestions that may help you reduce the time.

Also, set the verbose option to True (or to a higher integer for more detail). This shows the progress of the search while it runs.
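A minimal sketch of the verbose option, assuming a deliberately small grid (small_grid is a hypothetical name) and synthetic data so that it finishes quickly. With verbose=2, scikit-learn prints a summary of the number of fits followed by a line per fit with its elapsed time.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

# Synthetic stand-in data (assumption); replace with your own arrays.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 1))
y_demo = X_demo[:, 0] ** 2

# Reduced grid for illustration only: 4 candidates x 3 folds = 12 fits.
small_grid = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3], 'C': [1, 10]}

search = GridSearchCV(SVR(), small_grid, cv=TimeSeriesSplit(n_splits=3),
                      scoring='neg_mean_squared_error', verbose=2)
search.fit(X_demo, y_demo)  # progress lines are printed during this call

n_fits = len(search.cv_results_['params']) * 3
print('cross-validation fits:', n_fits)
```

The per-fit timings make it easy to see which parameter combinations are the slow ones.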

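Also, setting n_jobs=-1 will not necessarily reduce the run time, because starting worker processes and copying data to them adds overhead. See this answer for reference.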

User

#2 · Edited 2024-04-27 03:06:15

Use GridSearchCV(..., n_jobs=-1) to run the search in parallel on all available CPU cores.
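As a rough sketch, the same search can be run serially and in parallel to compare wall-clock times. The data here are synthetic stand-ins (an assumption), and on a problem this small the parallel run may well be slower because of worker start-up cost; the benefit shows up when each fit is expensive.

```python
import time

import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

# Synthetic stand-in data (assumption); replace with scaled_X / scaled_y.
rng = np.random.default_rng(0)
X_demo = rng.random((150, 1))
y_demo = np.sin(2 * np.pi * X_demo[:, 0])

grid = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5],
        'C': [1, 10, 100, 1000]}
cv = TimeSeriesSplit(n_splits=3)

timings = {}
searches = {}
for n_jobs in (1, -1):  # 1 = serial, -1 = all available cores
    start = time.perf_counter()
    searches[n_jobs] = GridSearchCV(
        SVR(), grid, cv=cv, scoring='neg_mean_squared_error', n_jobs=n_jobs
    ).fit(X_demo, y_demo)
    timings[n_jobs] = time.perf_counter() - start
    print(f'n_jobs={n_jobs}: {timings[n_jobs]:.2f} s')

serial, parallel = searches[1], searches[-1]
```

Both runs evaluate exactly the same candidates on the same folds, so they select the same parameters; only the elapsed time differs.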

Alternatively, you can use RandomizedSearchCV, which fits only a sample of the candidate parameter combinations.
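A minimal sketch of the randomized alternative, reusing the question's two parameter dictionaries. The data are synthetic stand-ins, and n_iter=10 and random_state=0 are illustrative choices rather than recommendations: the search fits 10 sampled candidates instead of all 32.

```python
import numpy as np
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

# Synthetic stand-in data (assumption); replace with scaled_X / scaled_y.
rng = np.random.default_rng(0)
X_demo = rng.random((120, 1))
y_demo = np.sin(2 * np.pi * X_demo[:, 0]) + 0.1 * rng.standard_normal(120)

# Same candidate values as the original grid (16 + 16 = 32 combinations).
param_distributions = [
    {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5],
     'C': [1, 10, 100, 1000]},
    {'kernel': ['poly'], 'C': [1, 10, 100, 1000], 'degree': [1, 2, 3, 4]},
]

search = RandomizedSearchCV(
    SVR(), param_distributions,
    n_iter=10,                      # number of sampled candidates
    cv=TimeSeriesSplit(n_splits=3),
    scoring='neg_mean_squared_error',
    random_state=0,                 # makes the sample reproducible
)
search.fit(X_demo, y_demo)

n_candidates = len(search.cv_results_['params'])
print('candidates tried:', n_candidates)
print('best params:', search.best_params_)
```

The trade-off is that the best combination in the full grid may not be among the sampled ones, so n_iter sets the balance between run time and coverage.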
