How do I use OneHotEncoder on a pandas Series of lists?

import pandas as pd
import numpy as np

d = {'A': [[5,7], [3, 4, 5], [2], [1,2,3,4]]}
df = pd.DataFrame(data=d)
df

              A
0        [5, 7]
1     [3, 4, 5]
2           [2]
3  [1, 2, 3, 4]

a = np.array(df['A'])
a

array([list([5, 7]), list([3, 4, 5]), list([2]), list([1, 2, 3, 4])], dtype=object)

from sklearn.preprocessing import OneHotEncoder

enc = OneHotEncoder(sparse = False)
X = enc.fit_transform(a)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-47-64181a9f7331> in <module>()
----> 1 X = enc.fit_transform(a)

~\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in fit_transform(self, X, y)
   2017         """
   2018         return _transform_selected(X, self._fit_transform,
-> 2019                                    self.categorical_features, copy=True)
   2020
   2021     def _transform(self, X):

~\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in _transform_selected(X, transform, selected, copy)
   1807     X : array or sparse matrix, shape=(n_samples, n_features_new)
   1808     """
-> 1809     X = check_array(X, accept_sparse='csc', copy=copy, dtype=FLOAT_DTYPES)
   1810
   1811     if isinstance(selected, six.string_types) and selected == "all":

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    431                                       force_all_finite)
    432     else:
--> 433         array = np.array(array, dtype=dtype, order=order, copy=copy)
    434
    435     if ensure_2d:

ValueError: setting an array element with a sequence.

1 Answer

User

#1 · Posted on 2024-04-20 14:11:05

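For a column whose entries are lists, use sklearn's MultiLabelBinarizer instead of OneHotEncoder. OneHotEncoder expects a 2-D array with a single scalar category in each cell, which is why it raises the ValueError on list-valued cells: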

from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer()
print(pd.DataFrame(mlb.fit_transform(df['A']), columns=mlb.classes_, index=df.index))
   1  2  3  4  5  7
0  0  0  0  0  1  1
1  0  0  1  1  1  0
2  0  1  0  0  0  0
3  1  1  1  1  0  0
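
For anyone who wants to run this without the setup from the question, here is a minimal self-contained sketch of the same approach. The `A_` column prefix and the variable names `encoded`, `result`, and `recovered` are my own choices for illustration, not part of the original answer. It also shows that `inverse_transform` should map the indicator matrix back to the original label sets.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Same data as in the question: each cell of column 'A' is a list of labels.
df = pd.DataFrame({'A': [[5, 7], [3, 4, 5], [2], [1, 2, 3, 4]]})

mlb = MultiLabelBinarizer()

# One indicator column per distinct label seen in any of the lists.
encoded = pd.DataFrame(
    mlb.fit_transform(df['A']),
    columns=mlb.classes_,
    index=df.index,
)

# Attach the indicator columns to the original frame.
# The 'A_' prefix is an illustrative naming choice.
result = df.join(encoded.add_prefix('A_'))

# Map the indicator matrix back to tuples of labels.
recovered = mlb.inverse_transform(encoded.to_numpy())

print(result)
print(recovered)
```

Keeping the fitted `mlb` object around matters if new rows need encoding later: calling `mlb.transform` on them reuses the same column layout, whereas refitting could change the set of columns.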
