Tags: python, scikit-learn, python-3.6, grid-search, dask-ml

Strange behaviour of GridSearchCV with hidden_layer_sizes


GridSearchCV (whether from sklearn or from dask) seems to do something strange or wrong with the parameters, which leads the MLPRegressor to ignore them.
I show the behaviour with a minimal working example.
Assume numerically initialized features and values; in my case

print(features.shape)
print(values.shape)
(321278, 36)
(321278,)

and running the following code

from dask_ml.model_selection import GridSearchCV as daskGridSearchCV
from sklearn.model_selection import GridSearchCV as skGridSearchCV
from sklearn.neural_network import MLPRegressor
myparams = {'hidden_layer_sizes': [(2, ), (4, )]}
daskgridCV = daskGridSearchCV(estimator=MLPRegressor(), n_jobs=-1, param_grid=myparams)
daskbestfit = daskgridCV.fit(features, values)
skgridCV = skGridSearchCV(estimator=MLPRegressor(), n_jobs=-1, param_grid=myparams, cv=3)
skbestfit = skgridCV.fit(features, values)
display(daskbestfit)
display(skbestfit)

results in

GridSearchCV(cache_cv=True, cv=None, error_score='raise',
             estimator=MLPRegressor(activation='relu', alpha=0.0001,
                                    batch_size='auto', beta_1=0.9, beta_2=0.999,
                                    early_stopping=False, epsilon=1e-08,
                                    hidden_layer_sizes=(100,),
                                    learning_rate='constant',
                                    learning_rate_init=0.001, max_iter=200,
                                    momentum=0.9, n_iter_no_change=10,
                                    nesterovs_momentum=True, power_t=0.5,
                                    random_state=None, shuffle=True,
                                    solver='adam', tol=0.0001,
                                    validation_fraction=0.1, verbose=False,
                                    warm_start=False),
             iid=True, n_jobs=-1,
             param_grid={'hidden_layer_sizes': [(2,), (4,)]}, refit=True,
             return_train_score=False, scheduler=None, scoring=None)
GridSearchCV(cv=3, error_score='raise-deprecating',
             estimator=MLPRegressor(activation='relu', alpha=0.0001,
                                    batch_size='auto', beta_1=0.9, beta_2=0.999,
                                    early_stopping=False, epsilon=1e-08,
                                    hidden_layer_sizes=(100,),
                                    learning_rate='constant',
                                    learning_rate_init=0.001, max_iter=200,
                                    momentum=0.9, n_iter_no_change=10,
                                    nesterovs_momentum=True, power_t=0.5,
                                    random_state=None, shuffle=True,
                                    solver='adam', tol=0.0001,
                                    validation_fraction=0.1, verbose=False,
                                    warm_start=False),
             iid='warn', n_jobs=-1,
             param_grid={'hidden_layer_sizes': [(2,), (4,)]},
             pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
             scoring=None, verbose=0)

Thus in both cases the hidden_layer_sizes parameter shows the value (100,), which is not in the grid. Am I doing something wrong, or what is happening here?

python-Version 3.6.9
sklearn-Version 0.21.2
dask_ml-Version 1.0.0


Solution

  • This is absolutely normal. estimator=MLPRegressor() creates an instance of MLPRegressor with its default values when GridSearchCV is initialized, and (100,) is the default value of the hidden_layer_sizes parameter. What display shows is that unfitted estimator template, not the winning configuration.

    By fitting GridSearchCV with data, it iterates through every combination of hyperparameters from myparams on each fold and picks the best one. You can check the cross-validation results by accessing skgridCV.cv_results_, and the winning combination via skgridCV.best_params_.
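    A minimal sketch of this (using a small synthetic dataset in place of the original features/values, and a reduced max_iter to keep it fast) shows that the grid value does end up in the fitted search's attributes:

    ```python
    from sklearn.datasets import make_regression
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPRegressor

    # Synthetic stand-in for the original features/values
    X, y = make_regression(n_samples=200, n_features=5, random_state=0)

    myparams = {'hidden_layer_sizes': [(2,), (4,)]}
    grid = GridSearchCV(estimator=MLPRegressor(max_iter=50),
                        param_grid=myparams, cv=3)
    grid.fit(X, y)

    # The repr of `grid` still shows hidden_layer_sizes=(100,) (the template),
    # but the fitted attributes hold the actual search outcome:
    print(grid.best_params_)                         # one of the grid values
    print(grid.best_estimator_.hidden_layer_sizes)   # the refit model uses it
    print('mean_test_score' in grid.cv_results_)     # per-combination scores
    ```

    The same attributes exist on the dask-ml GridSearchCV after fitting.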