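I am getting the error
[LightGBM] [Fatal] Check failed: (train_data->num_features()) > (0)
for a dataset X with shape (40, 7). I am trying to run gradient boosting with a custom loss function.
Any solutions or hints would be appreciated.
The error occurs on this call: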
gbm.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=custom_asymmetric_valid,
    verbose=False,
)
Here is the complete code:
import lightgbm
import lightgbm as lgb  # the later part of the script refers to the module as lgb
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
train = pd.read_csv("Data_Train.csv")
X, y = train.iloc[:, 1:-1], train.iloc[:, -1]
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.20, random_state=42)
print(np.shape(X_train),np.shape(X_valid))
test = pd.read_csv("Data_Test.csv")
X_test, y_test = test.iloc[:, 1:-1], test.iloc[:, -1]
# Defining custom loss function
def custom_asymmetric_train(y_true, y_pred):
    residual = (y_true - y_pred).astype("float")
    grad = np.where(residual<0, -2*10.0*residual, -2*residual)
    hess = np.where(residual<0, 2*10.0, 2.0)
    return grad, hess
def custom_asymmetric_valid(y_true, y_pred):
    residual = (y_true - y_pred).astype("float")
    loss = np.where(residual < 0, (residual**2)*10.0, residual**2)
    return "custom_asymmetric_eval", np.mean(loss), False
# default lightgbm model with sklearn api
gbm = lightgbm.LGBMRegressor(random_state=33)
# updating objective function to custom
# default is "regression"
# also adding metrics to check different scores
gbm.set_params(**{'objective': custom_asymmetric_train}, metrics = ["mse", 'mae'])
# fitting model
gbm.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=custom_asymmetric_valid,
    verbose=False,
)
y_pred = gbm.predict(X_valid)
# create dataset for lightgbm
lgb_train = lgb.Dataset(X_train, y_train, free_raw_data=False)
lgb_eval = lgb.Dataset(X_valid, y_valid, reference=lgb_train, free_raw_data=False)
params = {
    'objective': 'regression',
    'verbose': 0
}
gbm = lgb.train(params,
                lgb_train,
                num_boost_round=10,
                init_model=gbm,
                fobj=custom_asymmetric_train,
                feval=custom_asymmetric_valid,
                valid_sets=lgb_eval)
y_pred = gbm.predict(X_valid)
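Your original example is not fully reproducible (because the contents of "Data_Train.csv" were not shared), but using LightGBM 3.1.1 (installed with pip install lightgbm) I was able to reliably reproduce the error message you mentioned with the following code.
LightGBM has some parameters that are used to prevent overfitting. Two of them are relevant in this case.
By default, while constructing the Dataset object, LightGBM filters out features that cannot be split under these criteria.
LightGBM's parameter defaults are intended to give good performance on medium-sized datasets. A dataset of shape (40, 7) is very small, which increases the risk that every feature is unusable. To fit such a small dataset, you can override the defaults and set them to 0 or to smaller values. The code below trains successfully, without error.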
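As a rough illustration of why pre-filtering can remove every feature on a dataset this small, here is a minimal sketch in plain Python. It is not LightGBM's implementation and not the code from this answer. It assumes the two parameters meant above are min_data_in_leaf (default 20) and min_sum_hessian_in_leaf (default 1e-3), and it models only the first: a feature counts as usable when some threshold leaves at least min_data_in_leaf rows on each side. The helper name feature_is_splittable and the example feature values are hypothetical.

```python
def feature_is_splittable(values, min_data_in_leaf):
    """Return True if some threshold leaves >= min_data_in_leaf rows on each side.

    Simplified model of the pre-filtering idea, not LightGBM's real code.
    """
    ordered = sorted(values)
    n = len(ordered)
    for i in range(1, n):
        if ordered[i] == ordered[i - 1]:
            continue  # equal neighbours: no threshold can separate them
        left, right = i, n - i
        if left >= min_data_in_leaf and right >= min_data_in_leaf:
            return True
    return False


# 40 rows with test_size=0.20 leave 32 training rows.
n_train = 40 - 8
feature = list(range(n_train))  # hypothetical feature with 32 distinct values

# Assumed default of 20: both children would need 20 rows, i.e. 40 in total.
usable_default = feature_is_splittable(feature, min_data_in_leaf=20)
# Relaxed setting: any split between distinct values is acceptable.
usable_relaxed = feature_is_splittable(feature, min_data_in_leaf=0)

print(usable_default, usable_relaxed)
```

Under this simplified model, 32 training rows can never satisfy a minimum of 20 rows per leaf, so every feature is dropped, which is consistent with the num_features() > 0 check failing. In the scikit-learn wrapper the same two settings are exposed as min_child_samples and min_child_weight, so an override would presumably look like lightgbm.LGBMRegressor(random_state=33, min_child_samples=0, min_child_weight=0.0).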