Difference between StandardScaler and MinMaxScaler

2 answers

User

Reply 1 · edited 2024-04-25 08:43:22

StandardScaler removes the mean and scales the data to unit variance. However, outliers influence the empirical mean and standard deviation, which shrinks the range of the feature values, as shown in the left figure below. Note in particular that because the outliers on each feature have different magnitudes, the spread of the transformed data differs greatly between features: most of the data lie in the [-2, 4] range for the transformed median income feature, while the same data are squeezed into the smaller [-0.2, 0.2] range for the transformed number of households.
StandardScaler therefore cannot guarantee balanced feature scales in the presence of outliers.
MinMaxScaler rescales the data set so that all feature values lie in the range [0, 1], as shown in the right panel below. However, this scaling compresses all inliers into the narrow range [0, 0.005] for the transformed number of households.
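
To see the effect described above on a small scale, here is a minimal sketch. The data is synthetic (99 evenly spaced inliers plus one outlier at 1000) and is an assumption for illustration only; it is not the housing data the quoted text refers to.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic column (an assumption, not the data from the quoted text):
# 99 inliers evenly spaced in [0, 1] plus a single outlier at 1000.
inliers = np.linspace(0.0, 1.0, 99)
X = np.append(inliers, 1000.0).reshape(-1, 1)

X_std = StandardScaler().fit_transform(X)
X_mm = MinMaxScaler().fit_transform(X)

# Width of the interval the 99 inliers occupy after each transform.
std_inlier_width = float(X_std[:-1].max() - X_std[:-1].min())
mm_inlier_width = float(X_mm[:-1].max() - X_mm[:-1].min())

print("StandardScaler inlier width:", std_inlier_width)
print("MinMaxScaler inlier width:  ", mm_inlier_width)
```

In both cases the single outlier dominates the statistics the scaler is fitted on, so the inliers that originally spanned [0, 1] are pressed into a very narrow band (on the order of 0.01 wide for StandardScaler and 0.001 wide for MinMaxScaler on this toy column).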

User

Reply 2 · edited 2024-04-25 08:43:22

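MinMaxScaler(feature_range=(0, 1)) rescales every value in a column proportionally into the range [0, 1]. Use it as your first choice for transforming a feature, because it preserves the shape of the data distribution (no distortion).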

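StandardScaler() transforms every value in a column so that the column has mean 0 and standard deviation 1; that is, each value is standardized by subtracting the mean and dividing by the standard deviation. Use StandardScaler if you know the data is normally distributed.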
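
For reference, the two transforms described in this answer can be written per column as follows. The symbols are my own notation, not part of the original answer: $x$ is a value in the column, $x_{\min}$ and $x_{\max}$ are the column minimum and maximum, and $\mu$ and $\sigma$ are the column mean and standard deviation, all computed on the data the scaler is fitted on.

```latex
% MinMaxScaler with feature_range = (0, 1)
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}

% StandardScaler
z = \frac{x - \mu}{\sigma}
```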

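If there are outliers, use RobustScaler(). Alternatively, you can remove the outliers and then use either of the two scalers above (which one depends on whether the data is normally distributed).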
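
A rough sketch of both options mentioned above, on a hypothetical toy column. The 1.5 × IQR cutoff used to drop outliers is one common convention that I am assuming here; the answer does not say which rule to use.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical toy column: five ordinary values and one outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [1000.0]])

# Option 1: RobustScaler. With default settings it subtracts the median
# and divides by the interquartile range (IQR), so the outlier has little
# effect on how the other values are scaled.
X_robust = RobustScaler().fit_transform(X)

# Option 2: drop outliers first, then apply another scaler to X_clean.
# The 1.5 * IQR rule below is an assumed convention, not from the answer.
q1, q3 = np.percentile(X, [25, 75])
iqr = q3 - q1
mask = ((X >= q1 - 1.5 * iqr) & (X <= q3 + 1.5 * iqr)).ravel()
X_clean = X[mask]

print(X_robust.ravel())
print(X_clean.ravel())
```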

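Additional note: fitting a scaler before the train/test split causes data leakage. Split first, fit the scaler on the training set only, and then apply it to the test set.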
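
A minimal sketch of the leakage-free order, assuming random placeholder data (the array shapes, `random_state`, and `test_size` values are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Placeholder data, generated only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
y = rng.integers(0, 2, size=200)

# 1. Split first.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 2. Fit the scaler on the training set only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# 3. Reuse the training statistics on the test set (transform, not fit).
X_test_scaled = scaler.transform(X_test)
```

Because the scaler never sees the test rows during `fit`, the test set's mean and standard deviation cannot influence the preprocessing. The same pattern applies to MinMaxScaler and RobustScaler.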
