Tags: python, scikit-learn, normalization

Difference between Normalizer and MinMaxScaler


I'm trying to understand the effects of applying the Normalizer, applying MinMaxScaler, or applying both to my data. I've read the SKlearn docs and have seen some usage examples. I understand that MinMaxScaler is important (it is important to scale the features), but what about Normalizer?

The practical result of using the Normalizer on my data is still unclear to me.

MinMaxScaler is applied column-wise, while Normalizer is applied row-wise. What does that imply? Should I use the Normalizer, just use the MinMaxScaler, or use both?


Solution

  • As you have said,

    MinMaxScaler is applied column-wise, Normalizer is applied row-wise.

    Do not confuse Normalizer with MinMaxScaler. The Normalizer class from Sklearn normalizes samples individually to unit norm: it is a row-based, not a column-based, technique. In other words, MinMaxScaler computes its range from each column, while Normalizer divides each row by that row's own norm.
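
    As a minimal sketch of this difference (assuming only NumPy and scikit-learn, with a made-up toy matrix):

        import numpy as np
        from sklearn.preprocessing import MinMaxScaler, Normalizer

        X = np.array([[1.0, 4.0],
                      [2.0, 3.0],
                      [4.0, 1.0]])

        # MinMaxScaler works column-wise: each feature is rescaled to
        # [0, 1] using that column's own minimum and maximum.
        print(MinMaxScaler().fit_transform(X))
        # [[0.         1.        ]
        #  [0.33333333 0.66666667]
        #  [1.         0.        ]]

        # Normalizer works row-wise: each sample is divided by its own
        # (L2 by default) norm, so every row ends up with unit length.
        print(Normalizer().fit_transform(X))
        # [[0.24253563 0.9701425 ]
        #  [0.5547002  0.83205029]
        #  [0.9701425  0.24253563]]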

    So, remember that we scale features, not records, because we want the features to be on the same scale so that the trained model does not give different weights to different features merely because of their ranges. Scaling the records instead gives each record its own scale, which is not what we need.
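
    For instance (a hypothetical example), two records that point in the same direction but have very different magnitudes become indistinguishable after row-wise normalization:

        import numpy as np
        from sklearn.preprocessing import Normalizer

        # Two records with the same direction but very different scales.
        X = np.array([[1.0, 2.0],
                      [100.0, 200.0]])

        # Each row is divided by its own norm, so the magnitude
        # information that separated the two records is gone.
        print(Normalizer().fit_transform(X))
        # [[0.4472136  0.89442719]
        #  [0.4472136  0.89442719]]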

    So, if features are represented by rows, then you should use the Normalizer. But in most cases, features are represented by columns, so you should use one of the scalers from Sklearn depending on the case: