For hyperparameter tuning, I use the function GridSearchCV from the Python package sklearn. Some of the models that I test require feature scaling (e.g. Support Vector Regression, SVR). Recently, in the Udemy course Machine Learning A-Z™: Hands-On Python & R In Data Science, the instructors mentioned that for SVR the target should also be scaled (if it is not binary). Bearing this in mind, I wonder whether the target is also scaled in each iteration of the cross-validation procedure performed by GridSearchCV, or whether only the features are scaled. Please see the code below, which illustrates the procedure I normally use for hyperparameter tuning with estimators that require the training set to be scaled:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

def SVRegressor(**kwargs):
    '''Construct a pipeline that scales the features and then performs SVR regression.'''
    return make_pipeline(StandardScaler(), SVR(**kwargs))

params = {'svr__kernel': ["poly", "rbf"]}

grid_search = GridSearchCV(SVRegressor(), params)
grid_search.fit(X, y)
I know that I could simply scale X and y a priori and drop the StandardScaler from the pipeline. However, I want to use this approach in a code pipeline where multiple models are tested, some of which require scaling and others do not, as sketched below. That is why I want to know how GridSearchCV handles scaling under the hood.
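To give a concrete picture, a stripped-down sketch of that multi-model setup could look like the following (the second estimator and both parameter grids are just placeholders for illustration):

from sklearn.ensemble import RandomForestRegressor
# (other imports as in the snippet above)

# Some candidates get a scaler in their pipeline, others are used as-is.
candidates = {
    'svr': (make_pipeline(StandardScaler(), SVR()),
            {'svr__kernel': ["poly", "rbf"]}),
    'rf': (RandomForestRegressor(),
           {'n_estimators': [100, 300]}),
}

for name, (estimator, param_grid) in candidates.items():
    search = GridSearchCV(estimator, param_grid)
    search.fit(X, y)
    print(name, search.best_score_)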
No, it doesn't scale the target. If you look at what make_pipeline builds, the pipeline simply passes the X and y arguments to each transformer, and StandardScaler() does nothing with your y:
def _fit_transform_one(transformer,
                       X,
                       y,
                       weight,
                       message_clsname='',
                       message=None,
                       **fit_params):
    """
    Fits ``transformer`` to ``X`` and ``y``. The transformed result is returned
    with the fitted transformer. If ``weight`` is not ``None``, the result will
    be multiplied by ``weight``.
    """
    with _print_elapsed_time(message_clsname, message):
        if hasattr(transformer, 'fit_transform'):
            res = transformer.fit_transform(X, y, **fit_params)
        else:
            res = transformer.fit(X, y, **fit_params).transform(X)

    if weight is None:
        return res, transformer
    return res * weight, transformer
You can try this with StandardScaler() directly and see that it does nothing with y:
import numpy as np

np.random.seed(111)
X = np.random.normal(5, 2, (100, 3))
y = np.random.normal(5, 2, 100)

# y is accepted but ignored; only the transformed X is returned
res = StandardScaler().fit_transform(X=X, y=y)

res.shape
(100, 3)
res.mean(axis=0)
array([1.01030295e-15, 4.39648318e-16, 8.91509089e-16])
res.std(axis=0)
array([1., 1., 1.])
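As a further sanity check (a small sketch using the same X and y as above), the pipeline's predictions come back on the original scale of y; if y were being scaled inside the pipeline, they would come back near 0 instead:

pipe = make_pipeline(StandardScaler(), SVR())
pipe.fit(X, y)
pipe.predict(X)[:3]   # values near 5, i.e. on the original scale of y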
You can also check the result of your GridSearchCV:
SVRegressor = make_pipeline(StandardScaler(), SVR())
params = {'svr__kernel': ["poly", "rbf"]}
grid_search = GridSearchCV(SVRegressor, params,
scoring='neg_mean_absolute_error')
With unscaled y, you will see that the negative mean absolute error is on roughly the same scale as the standard deviation of y (which was 2 in my example):
grid_search.fit(X, y)
grid_search.cv_results_['mean_test_score']
array([-2.01029707, -1.88779205])
With scaled y, the standard deviation is 1, and you can see the error is accordingly around -1:
y_scaled = StandardScaler().fit_transform(y.reshape(-1,1)).ravel()
grid_search.fit(X, y_scaled)
grid_search.cv_results_['mean_test_score']
array([-1.00585999, -0.88330208])
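If you do want the target standardized inside each cross-validation fold, one option is to wrap the pipeline in sklearn.compose.TransformedTargetRegressor. A minimal sketch (note the regressor__ prefix the wrapper adds to the parameter names):

from sklearn.compose import TransformedTargetRegressor

# Standardize y on each training fold; predictions are automatically
# mapped back to the original scale via the transformer's inverse_transform.
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR()),
    transformer=StandardScaler())

params = {'regressor__svr__kernel': ["poly", "rbf"]}
grid_search = GridSearchCV(model, params,
                           scoring='neg_mean_absolute_error')
grid_search.fit(X, y)

Because predict() inverse-transforms automatically, GridSearchCV still scores on the original scale of y.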