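I have been learning Python from YouTube videos. I'm new to Python, just a beginner. I saw this code in a video and tried it myself, but I got an error that I don't know how to solve. Below is the code I'm having trouble with. I haven't included the entire code because it is too long.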
```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn import svm
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
%matplotlib inline
wine = pd.read_csv('wine_quality.csv')
wine.head()
wine.info()
wine.isnull().sum()
#Preprocessing
bins=(2,6.5,8)
group_names=['bad','good']
wine['quality'] = pd.cut(wine['quality'], bins=bins, labels=group_names)
wine['quality'].unique()
label_quality=LabelEncoder()
wine['quality']=label_quality.fit_transform(wine['quality'])
# The fit_transform call above raises the following error:
'''TypeError Traceback (most recent call last)
~\anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in _encode(values, uniques, encode, check_unknown)
112 try:
--> 113 res = _encode_python(values, uniques, encode)
114 except TypeError:
~\anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in _encode_python(values, uniques, encode)
60 if uniques is None:
---> 61 uniques = sorted(set(values))
62 uniques = np.array(uniques, dtype=values.dtype)
TypeError: '<' not supported between instances of 'float' and 'str'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-14-8e211b2c4bf8> in <module>
----> 1 wine['quality'] = label_quality.fit_transform(wine['quality'])
~\anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in fit_transform(self, y)
254 """
255 y = column_or_1d(y, warn=True)
--> 256 self.classes_, y = _encode(y, encode=True)
257 return y
258
~\anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in _encode(values, uniques, encode, check_unknown)
115 types = sorted(t.__qualname__
116 for t in set(type(v) for v in values))
--> 117 raise TypeError("Encoders require their input to be uniformly "
118 f"strings or numbers. Got {types}")
119 return res
TypeError: Encoders require their input to be uniformly strings or numbers. Got ['float', 'str']'''
```
Please help me correct my mistake. It would be great if you could tell me exactly what to do.
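I checked the wine quality dataset. Some of the values in `quality` exceed the upper limit you passed in `bins` to `pd.cut()`, and values that fall outside the bins are replaced with NaN. I also ran your preprocessing on my side, and afterwards the result of `wine['quality'].unique()` includes NaN. This is because every value above 8 (the upper limit you provided) is changed to NaN, which the documentation for `pd.cut()` also mentions. NaN is a float while the labels are strings, which is exactly the mix of `float` and `str` that the encoder complains about.

Once no value falls outside the bins, the output of `wine['quality'].unique()` no longer contains NaN, and your label encoder should work fine.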
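The behaviour described above can be reproduced with a small sketch. The sample `quality` values below are made up for illustration (the real CSV is not loaded), and raising the upper bin edge to the column maximum is only one possible fix, assumed here rather than taken from the original answer.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for wine['quality']; includes a 9, which is above the upper bin edge of 8.
quality = pd.Series([3, 5, 6, 7, 8, 9])
group_names = ['bad', 'good']

# Original bins: (2, 6.5] -> 'bad', (6.5, 8] -> 'good'. The value 9 falls outside and becomes NaN.
original = pd.cut(quality, bins=(2, 6.5, 8), labels=group_names)
n_missing_original = int(original.isna().sum())

# Assumed fix: make the last bin edge at least as large as the largest value in the column.
upper = float(max(8, quality.max()))
fixed = pd.cut(quality, bins=(2, 6.5, upper), labels=group_names)
n_missing_fixed = int(fixed.isna().sum())

# With no NaN left, the column holds only strings and LabelEncoder accepts it.
label_quality = LabelEncoder()
encoded = label_quality.fit_transform(fixed)

print(n_missing_original, n_missing_fixed)
print(list(label_quality.classes_), list(encoded))
```

If the out-of-range rows are not needed, dropping them with `dropna(subset=['quality'])` before encoding would be another option; which one is appropriate depends on how the highest scores should be treated.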