Can XGBoost `.predict()` run without a label or target column?

Posted 2024-04-30 00:41:55


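I am trying to use XGBoost to predict on a test dataset that has no labels. How can I keep the XGBoost model from failing when the target column is not provided?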

# TRAIN_DATA looks similar to TEST_DATA except TEST_DATA does not have a `target` column

import xgboost as xgb
# read in data
dtrain = xgb.DMatrix(TRAIN_DATA, label=TRAIN_DATA.target)
dtest = xgb.DMatrix(TEST_DATA)
# specify parameters via map
param = {'max_depth':2, 'eta':1, 'objective':'binary:logistic' }
num_round = 2
bst = xgb.train(param, dtrain, num_round)
# make prediction
preds = bst.predict(dtest)

Output:

                raise ValueError(msg.format(self.feature_names,
>                                           data.feature_names))
E               ValueError: feature_names mismatch: ['geohash', 'uupm', 'driver_supply', 'requested_at', 'target'] ['geohash', 'uupm', 'driver_supply', 'requested_at']
E               expected target in input data

../venvs/venv3/lib/python3.6/site-packages/xgboost/core.py:1541: ValueError

1 answer
User
Reply #1 · Posted 2024-04-30 00:41:55

Try the following code.

Select all columns except the target column, as shown below:

TRAIN_DATA[TRAIN_DATA.columns.difference(['target'], sort=False)]

Then change your code so that it trains on the feature columns only:

dtrain = xgb.DMatrix(TRAIN_DATA[TRAIN_DATA.columns.difference(['target'], sort=False)], label=TRAIN_DATA.target)
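One thing to watch when selecting the feature columns: by default `Index.difference` returns the remaining labels sorted, which can change the column order relative to the test data. The traceback above suggests XGBoost compares feature names as ordered lists, so a reordering would probably trigger the same mismatch again. The sketch below uses a made-up frame with the column names from the question to show the difference; `sort=False` assumes pandas 0.24 or later, and `DataFrame.drop` is an alternative that keeps the original order.

```python
import pandas as pd

# Toy frame with the column layout from the question (values are invented).
train = pd.DataFrame({
    "geohash": [1, 2, 3],
    "uupm": [0.1, 0.2, 0.3],
    "driver_supply": [5, 6, 7],
    "requested_at": [10, 20, 30],
    "target": [0, 1, 0],
})

# Default behaviour: the remaining labels come back sorted alphabetically.
sorted_cols = list(train.columns.difference(["target"]))

# sort=False keeps the labels in their original order.
ordered_cols = list(train.columns.difference(["target"], sort=False))

# drop(columns=...) also preserves the original column order.
dropped_cols = list(train.drop(columns=["target"]).columns)

print(sorted_cols)
print(ordered_cols)
print(dropped_cols)
```

If the test frame might carry its columns in a different order, indexing it with the training feature list (for example `TEST_DATA[ordered_cols]`) keeps both sides aligned.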

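Ideally, the target should never be exposed as a feature during training. If it is, the model expects that column at prediction time, which is what the error above reports.

After this change you can run inference without any problem.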
