One-hot encoding with sklearn.preprocessing LabelBinarizer

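2 answers

User

#1 · Edited 2024-04-25 00:09:08

As already noted, this is not a problem with the method. According to the documentation, binary targets are transformed to a column vector. In the two-class case, you can build the desired array from that column vector.

A direct and simple way to do it: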

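import numpy as np
from sklearn import preprocessing

lb = preprocessing.LabelBinarizer()
lb.fit(range(2))  # range(0, 2) is the same as range(2)
a = lb.transform([1, 0])  # column vector: [[1], [0]]
# First column is the binarized label, second column is its complement.
result_2d = np.array([[item[0], 0 if item[0] else 1] for item in a])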
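
As a variation on the same idea, the column vector can be expanded with NumPy alone. This is a minimal sketch, not part of the original answer: `expand_binary_column` is a hypothetical helper name, and it orders the columns by class (label 1 maps to `[0, 1]`), whereas the list comprehension above puts the label itself in the first column.

```python
import numpy as np


def expand_binary_column(column):
    """Turn an (n, 1) array of 0/1 values into an (n, 2) one-hot array."""
    column = np.asarray(column)
    # Column 0 marks class 0, column 1 marks class 1.
    return np.hstack((1 - column, column))


# Literal stand-in with the same shape as lb.transform([1, 0]).
a = np.array([[1], [0]])
result_2d = expand_binary_column(a)
print(result_2d)
# [[0 1]
#  [1 0]]
```

Pick whichever column order your downstream code expects; the two approaches differ only in which column comes first.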

User

#2 · Edited 2024-04-25 00:09:08

According to the documentation, the purpose of LabelBinarizer() is:

Binarize labels in a one-vs-all fashion
Several regression and binary classification algorithms are available in scikit-learn. A simple way to extend these algorithms to the multi-class classification case is to use the so-called one-vs-all scheme.

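If your data has only two kinds of labels, you can feed it directly to a binary classifier. A single column is therefore enough to capture both classes in a one-vs-rest fashion.

Binary targets transform to a column vector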

>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit_transform(['yes', 'no', 'no', 'yes'])
array([[1],
       [0],
       [0],
       [1]])
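
For contrast, a short sketch of how the output shape changes once a third class appears. The label `'maybe'` is made up for illustration and does not come from the documentation excerpt above.

```python
from sklearn.preprocessing import LabelBinarizer

# Two classes: a single column is enough.
binary = LabelBinarizer().fit_transform(["yes", "no", "no", "yes"])

# Three classes: one column per class, in sorted order (maybe, no, yes).
multi = LabelBinarizer().fit_transform(["yes", "no", "maybe", "yes"])

print(binary.shape)  # (4, 1)
print(multi.shape)   # (4, 3)
```

So the single-column result is specific to the two-class case; with three or more classes LabelBinarizer already returns one column per class.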

If your goal is simply to create a one-hot encoding, use the following approach.

[code block not preserved in the source]
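
The snippet for this step is not preserved here. As a hedged sketch of one way to get a two-column output for two-class data, the example below assumes scikit-learn's `OneHotEncoder` fits the intent; that is an assumption, not necessarily what the answer used. The encoder expects a 2-D input and returns a sparse matrix by default, so the labels are reshaped and the result is densified with `.toarray()`.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array(["yes", "no", "no", "yes"]).reshape(-1, 1)  # one feature column

encoder = OneHotEncoder()
one_hot = encoder.fit_transform(labels).toarray()  # dense (4, 2) float array

print(encoder.categories_[0])  # ['no' 'yes']
print(one_hot)
# [[0. 1.]
#  [1. 0.]
#  [1. 0.]
#  [0. 1.]]
```

Columns follow the sorted category order, so `'no'` is column 0 and `'yes'` is column 1.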

Hope this clarifies your question: why sklearn's LabelBinarizer() does not convert two-class data into a two-column output.
