Difference between StandardScaler and MinMaxScaler

Posted 2024-04-25 08:43:22


What is the difference between MinMaxScaler and StandardScaler?

MMS = MinMaxScaler(feature_range=(0, 1)) (used in program 1)

sc = StandardScaler() (in another program, StandardScaler was used instead of MinMaxScaler)


2 Answers

From the scikit-learn site:

StandardScaler removes the mean and scales the data to unit variance. However, the outliers have an influence when computing the empirical mean and standard deviation which shrink the range of the feature values as shown in the left figure below. Note in particular that because the outliers on each feature have different magnitudes, the spread of the transformed data on each feature is very different: most of the data lie in the [-2, 4] range for the transformed median income feature while the same data is squeezed in the smaller [-0.2, 0.2] range for the transformed number of households.

StandardScaler therefore cannot guarantee balanced feature scales in the presence of outliers.

MinMaxScaler rescales the data set such that all feature values are in the range [0, 1] as shown in the right panel below. However, this scaling compresses all inliers into the narrow range [0, 0.005] for the transformed number of households.

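MinMaxScaler(feature_range=(0, 1)) rescales each value in a column proportionally into the range [0, 1]. Use it as the first scaling choice when transforming a feature, because it preserves the shape of the dataset (no distortion).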
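As a minimal sketch of what that rescaling does, the snippet below compares MinMaxScaler with the formula (x - min) / (max - min) computed by hand. The toy column X is made up for illustration and is not from the question.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy single-feature column (illustrative values, not from the question).
X = np.array([[1.0], [2.0], [3.0], [5.0]])

# Fit on the column and rescale it into [0, 1].
mms = MinMaxScaler(feature_range=(0, 1))
X_scaled = mms.fit_transform(X)

# The same rescaling by hand: (x - min) / (max - min), per column.
X_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(X_scaled.ravel())
print(np.allclose(X_scaled, X_manual))
```

The smallest value maps to 0 and the largest to 1, and the relative spacing between the values in between is unchanged.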

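StandardScaler() transforms each column so that it has mean 0 and standard deviation 1, i.e. it standardizes each value by subtracting the column mean and dividing by the column standard deviation. Use StandardScaler if you know the data is normally distributed.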
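The subtract-the-mean, divide-by-the-standard-deviation step can be sketched as follows. The toy column is an assumption for illustration. Note that StandardScaler uses the population standard deviation (ddof=0), which is also NumPy's default.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy single-feature column (illustrative values, not from the question).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])

sc = StandardScaler()
X_std = sc.fit_transform(X)

# The same standardization by hand: z = (x - mean) / std, with ddof=0.
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(X_std, X_manual))
print(X_std.mean(axis=0), X_std.std(axis=0))
```

Unlike MinMaxScaler, the output is not bounded to a fixed interval; it is centered on 0 with unit spread.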

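If there are outliers, use RobustScaler(). Alternatively, you can remove the outliers and then use either of the two scalers above (the choice depends on whether the data is normally distributed).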
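To illustrate why outliers matter, here is a small sketch with one extreme value added to an otherwise narrow column. The data is made up. With its default settings, RobustScaler centers on the median and scales by the interquartile range, so the outlier does not determine the scale of the remaining points.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler

# Four inliers plus one extreme outlier (illustrative values only).
X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

X_minmax = MinMaxScaler().fit_transform(X)
X_robust = RobustScaler().fit_transform(X)

# Spread of the four inliers after each transformation.
inlier_spread_minmax = X_minmax[:4].max() - X_minmax[:4].min()
inlier_spread_robust = X_robust[:4].max() - X_robust[:4].min()

print("MinMaxScaler inlier spread:", inlier_spread_minmax)
print("RobustScaler inlier spread:", inlier_spread_robust)
```

Under MinMaxScaler the inliers are squeezed into a tiny slice of [0, 1] because the outlier defines the maximum. Under RobustScaler they keep a usable spread, and the outlier simply ends up far from the rest.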

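Additional note: if the scaler is fitted before the train/test split, data leakage occurs, because statistics from the test rows influence the transformation. Fit the scaler on the training set only, after the split, and then apply it to both the training and test sets.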
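A leakage-free ordering might look like the sketch below. The random data, the 80/20 split, and the variable names are assumptions for illustration. The key point is that fit_transform is called only on the training rows, and the test rows are passed through transform.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(100, 2))
y = rng.integers(0, 2, size=100)

# 1. Split first.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 2. Fit the scaler on the training rows only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# 3. Reuse the training mean and std for the test rows (no refitting).
X_test_scaled = scaler.transform(X_test)
```

The same pattern applies to MinMaxScaler and RobustScaler: the scaler learns its statistics from the training set and is never refitted on test data.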
