Tags: machine-learning, normalization, data-preprocessing, feature-scaling

Feature Scaling with MinMaxScaler()


I have 31 features to feed into an ML algorithm. Of these, 22 feature values are already in the range 0 to 1; the remaining 9 vary between 0 and 750. If I apply MinMaxScaler() with the range set to (0, 1), should I scale all the features, or only the 9 that fall outside the desired range? Which is more appropriate?


Solution

  • It is better to scale all variables.

    If you leave those 22 features unscaled, their scales will still differ from the rest.

    For example, the 22 unscaled features might only span 0.2 to 0.7, while the 9 scaled features will span exactly 0 to 1 (each column's minimum maps to 0 and its maximum to 1).

    Then, when doing the math, a feature whose minimum is 0.2 rather than 0 sits on a different scale from the others, which can make learning more difficult.
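A minimal sketch of the recommended approach, fitting scikit-learn's MinMaxScaler on all 31 columns at once (the data here is synthetic, standing in for the question's 22 small-range and 9 large-range features):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data: 22 features already "in range" (but not
# spanning the full [0, 1]), plus 9 features ranging up to 750.
rng = np.random.default_rng(0)
X_small = rng.uniform(0.2, 0.7, size=(100, 22))
X_large = rng.uniform(0.0, 750.0, size=(100, 9))
X = np.hstack([X_small, X_large])

# Fit on ALL 31 columns; MinMaxScaler rescales each column
# independently so every column spans exactly [0, 1].
scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X)

print(X_scaled.min(axis=0))  # each column's minimum is now 0
print(X_scaled.max(axis=0))  # each column's maximum is now 1
```

Because MinMaxScaler scales each column independently, passing all 31 features through it is harmless for the 22 small-range columns and puts every feature on the same footing.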