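My dataset has multiple NaN values in the dependent-variable column. I have split the dataset into dependent and independent variables, and I am now trying to replace every NaN in the dependent-variable column with 0. However, I get an error when I use SimpleImputer to do this.
Here is my code: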
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
"""
----- Read Dataset and Split into Dependent and Independent Variables: -----
"""
dataset = pd.read_csv('Salary_Data.csv')
x = dataset.iloc[:, 1:-1].values
y = dataset.iloc[:, -1].values
print("\nIndependent Variables: \n%s" % x)
print("\nDependent Variables: \n%s" % y)
"""
----- Fill in Missing Values: -----
"""
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values = np.nan, strategy = 'constant', fill_value = 0)
y = imputer.fit(y)
print("\nDependent Variables After Missing Values Adjusted: \n%s" % y)
Here is the error I get:
Expected 2D array, got 1D array instead:
array=[270000. 200000. 250000. nan 425000. nan nan 252000. 231000.
nan 260000. 250000. nan 218000. nan 200000. 300000. nan
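The message points at the cause: SimpleImputer works on 2D input of shape (n_samples, n_features), while `y` is a 1D array. In addition, `fit` returns the fitted imputer itself rather than the filled data, so `y = imputer.fit(y)` would overwrite `y` with the imputer object even if the shape were accepted. A minimal sketch of one way around both issues, using a few made-up sample values in place of the real column:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up sample values standing in for the real dependent-variable column.
y = np.array([270000.0, 200000.0, np.nan, 425000.0, np.nan])

imputer = SimpleImputer(missing_values=np.nan, strategy="constant", fill_value=0)

# reshape(-1, 1) turns the 1D array into a single-column 2D array;
# fit_transform returns the filled data, and ravel() flattens it back to 1D.
y_filled = imputer.fit_transform(y.reshape(-1, 1)).ravel()

print(y_filled)
```

The result is kept in a separate name, `y_filled`, only so the original and filled arrays can be compared; assigning back to `y` works just as well.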
You can use the pandas.DataFrame.fillna function to fill every NaN with 0 directly.
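As a rough sketch of that suggestion, the fill can be done on the DataFrame before it is split into `x` and `y`. The small frame below is a hypothetical stand-in for Salary_Data.csv, and its column names are assumptions rather than the real ones:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Salary_Data.csv; column names are assumptions.
dataset = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "experience": [1.5, 3.0, 4.2, 6.0],
    "salary": [270000.0, np.nan, 250000.0, np.nan],
})

# Fill NaN in the last (dependent) column only, leaving other columns untouched.
target = dataset.columns[-1]
dataset[target] = dataset[target].fillna(0)

# Same split as in the question.
x = dataset.iloc[:, 1:-1].values
y = dataset.iloc[:, -1].values

print(y)
```

Calling `dataset.fillna(0)` instead would replace NaN in every column, which may not be wanted if the independent variables also contain missing values.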