Tags: python, numpy, multidimensional-array, scikit-learn, normalize

MinMaxScaler: Normalise a 4D input array


I have a 4D input array that I would like to normalise using MinMaxScaler. For simplicity, here is an example array:

import numpy as np

A = np.array([
            [[[0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 3, 0, 1],
              [1, 3, 2, 1],
              [1, 2, 3, 0]]],
            
            [[[9, 8, 7, 6],
              [5, 4, 3, 2],
              [0, 9, 8, 3],
              [1, 9, 2, 3],
              [1, 0, -1, 2]]],
            
            [[[0, 7, 1, 2],
              [1, 2, 1, 0],
              [0, 2, 0, 7],
              [-1, 3, 0, 1],
              [1, 0, 1, 0]]]
              ])
A.shape
# (3, 1, 5, 4)

In the given example, the array contains 3 input samples, where each sample has the shape (1,5,4). Each column of the input represents 1 variable (feature), so each sample has 4 features.

I would like to normalise the input data, but MinMaxScaler expects a 2D array of shape (n_samples, n_features), like a DataFrame.

How then do I use it to normalise this input data?


Solution

  • Vectorize the data: flatten each sample into a single row so that the array becomes 2D

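    import numpy as np
    from sklearn.preprocessing import MinMaxScaler
    
    scaler = MinMaxScaler()
    
    # Drop the singleton axis: (3, 1, 5, 4) -> (3, 5, 4)
    A_sq = np.squeeze(A)
    
    print(A_sq.shape)
    # (3, 5, 4)
    
    # Flatten each sample into one row of 20 values, then fit: (3, 5, 4) -> (3, 20)
    scaler.fit(A_sq.reshape(3, -1))
    # MinMaxScaler()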
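The snippet above only fits the scaler. Below is a minimal sketch of how the scaled array could then be obtained and restored to its original 4D shape. It also shows a possible alternative: flattening to (3, 20) scales each of the 20 positions independently, whereas the question states that the 4 columns are the features, so reshaping to (-1, 4) may be closer to what is wanted. The names `A_per_position`, `A_per_feature`, and `A_restored` are illustrative and not part of the original answer.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same example array as in the question, shape (3, 1, 5, 4)
A = np.array([
    [[[0, 1, 2, 3], [3, 0, 1, 2], [2, 3, 0, 1], [1, 3, 2, 1], [1, 2, 3, 0]]],
    [[[9, 8, 7, 6], [5, 4, 3, 2], [0, 9, 8, 3], [1, 9, 2, 3], [1, 0, -1, 2]]],
    [[[0, 7, 1, 2], [1, 2, 1, 0], [0, 2, 0, 7], [-1, 3, 0, 1], [1, 0, 1, 0]]],
])

# Option 1 (the approach above): one row per sample, 20 columns.
# Each of the 20 positions is scaled using only the 3 sample values at that position.
flat = A.reshape(A.shape[0], -1)                       # (3, 20)
A_per_position = MinMaxScaler().fit_transform(flat).reshape(A.shape)

# Option 2 (assumption: the last axis holds the 4 features, as the question says).
# Every row of every sample becomes an observation of the 4 features.
cols = A.reshape(-1, A.shape[-1])                      # (15, 4)
scaler = MinMaxScaler()
A_per_feature = scaler.fit_transform(cols).reshape(A.shape)

# The same reshape pattern undoes the scaling.
A_restored = scaler.inverse_transform(
    A_per_feature.reshape(-1, A.shape[-1])
).reshape(A.shape)

print(A_per_position.shape, A_per_feature.shape)
```

Which option is appropriate depends on what counts as a feature in the data. With Option 2, a scaler fitted on training data can be reused on new samples by applying the same reshape before `scaler.transform`.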