I have been trying to use DataFrameMapper to add several preprocessing transformations on a DataFrame to a scikit-learn pipeline:
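import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelBinarizer, Normalizer
from sklearn.tree import DecisionTreeClassifier
from sklearn_pandas import DataFrameMapper

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data"
names = ['Sex', 'Length', 'Diameter', 'Height', 'Whole weight', 'Shucked weight', 'Viscera weight', 'Shell weight', 'Rings']
df = pd.read_csv(url, names=names)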
mapper = DataFrameMapper(
[('Height', Normalizer()), ('Sex', LabelBinarizer())]
)
stages = []
stages += [("mapper", mapper)]
estimator = DecisionTreeClassifier()
stages += [("dtree", estimator)]
pipeline = Pipeline(stages)
labelCol = 'Rings'
target = df[labelCol]
data = df.drop(labelCol, axis=1)
train_data, test_data, train_target, expected = train_test_split(data, target, test_size=0.25, random_state=33)
model = pipeline.fit(train_data, train_target)
However, I get an error when I run this.
What am I missing?
Thanks :)
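You have to change the structure of the DataFrameMapper: give the selector for Height as a one-element list, ['Height'], and leave 'Sex' as a plain string. This is a subtle detail covered in the sklearn_pandas documentation. A column selector written as a plain string passes a one-dimensional array to the transformer, while a one-element list passes a two-dimensional array with a single column. Normalizer expects 2-D input, whereas LabelBinarizer works on a 1-D array of labels.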
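Here is a minimal sketch of why the selector form matters, using only pandas and scikit-learn rather than DataFrameMapper itself. It assumes, following the sklearn_pandas documentation, that a string selector hands the transformer a 1-D array and a list selector hands it a 2-D column, and it mimics that with df["Height"] versus df[["Height"]]. The small DataFrame is made up for illustration and is not the abalone data.

```python
import pandas as pd
from sklearn.preprocessing import Normalizer

# Tiny stand-in for the abalone data (illustrative values only).
df = pd.DataFrame({"Height": [0.095, 0.090, 0.135, 0.125]})

# What a string selector such as 'Height' corresponds to: a 1-D array.
one_d = df["Height"].to_numpy()

# What a list selector such as ['Height'] corresponds to: a 2-D column.
two_d = df[["Height"]].to_numpy()

# Normalizer validates its input as a 2-D array, so the 1-D case fails.
try:
    Normalizer().fit_transform(one_d)
    one_d_error = None
except ValueError as exc:
    one_d_error = exc

# The 2-D column is accepted and keeps its (n_samples, 1) shape.
result = Normalizer().fit_transform(two_d)

print("1-D shape:", one_d.shape, "| raised:", type(one_d_error).__name__)
print("2-D shape:", two_d.shape, "| transformed shape:", result.shape)
```

In the mapper from the question, the corresponding change would be (['Height'], Normalizer()) in place of ('Height', Normalizer()), with ('Sex', LabelBinarizer()) left as it is.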