Tags: python, scikit-learn, hyperparameters

Conditional tuning of hyperparameters with RandomizedSearchCV in scikit-learn


I want to use RandomizedSearchCV in scikit-learn to search for the optimal hyperparameter values for a support vector classifier on my dataset. The hyperparameters I am optimising are "kernel", "C" and "gamma". However, in the case of a "poly" kernel, I would also like to optimise a fourth hyperparameter, "degree" (the degree of the polynomial kernel function).

I realise that since the degree hyperparameter is ignored when the kernel is not "poly", I could just include degree in the params dictionary I pass to RandomizedSearchCV (as I've done in the code below). However, ideally I would like to search uniformly across the non-poly kernels plus each degree of the poly kernel, i.e. I want to sample uniformly across e.g. [(kernel="linear"), (kernel="rbf"), (kernel="poly", degree=2), (kernel="poly", degree=3)]. Therefore, I was wondering whether it is possible to introduce a hyperparameter conditionally, i.e. if kernel="poly" then search degree over np.linspace(2, 5, 4), else leave degree out of the search.

I haven't been able to find an example of this in the RandomizedSearchCV documentation, and so was wondering if anybody here had come across the same issue and would be able to help. Thanks!

import numpy as np

from sklearn.svm import SVC
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import StratifiedKFold

clf = SVC()

# degree is only used by the "poly" kernel; the other kernels ignore it
params = {'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
          'degree': np.linspace(2, 5, 4),
          'C': np.logspace(-3, 5, 17),
          'gamma': np.logspace(-3, 5, 17)}

random_search = RandomizedSearchCV(
    estimator=clf, param_distributions=params, n_iter=200, n_jobs=-1,
    cv=StratifiedKFold(n_splits=5), iid=False  # iid was removed in scikit-learn 0.24
)

Solution

  • Unfortunately, GridSearchCV and RandomizedSearchCV don't support conditional tuning of hyperparameters.

    Hyperopt does support conditional tuning of hyperparameters; check this wiki for more details.

    Example:

    from hyperopt import hp

    # Each kernel choice carries its own sub-space: 'degree' is only
    # defined (and therefore only sampled) inside the 'poly' branch.
    space4svm = {
        'C': hp.uniform('C', 0, 20),
        'kernel': hp.choice('kernel', [
            {'ktype': 'linear'},
            {'ktype': 'poly', 'degree': hp.lognormal('degree', 0, 1)},
        ]),
        'gamma': hp.uniform('gamma', 0, 20),
        'scale': hp.choice('scale', [0, 1]),
        'normalize': hp.choice('normalize', [0, 1])
    }