Question: How do I define the kernel of a Gaussian Process Regressor using BayesSearchCV?
I'm trying to optimize the hyperparameters of a Gaussian process model using BayesSearchCV from skopt. It seems that I'm defining the kernel incorrectly, because I get a 'TypeError':
TypeError: Cannot clone object ''rbf'' (type <class 'str'>): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' method.
Dummy-Code:
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from skopt import BayesSearchCV
from skopt.space import Real, Categorical, Integer
from sklearn.gaussian_process.kernels import RBF, DotProduct, Matern
X, y = make_regression(100, 10)
estimator = GaussianProcessRegressor()
param = {
    'kernel': ['rbf', 'matern'],
    'n_restarts_optimizer': (5, 10),
    'alpha': (1e-5, 1e-2, 'log-uniform'),
}
opt = BayesSearchCV(
    estimator=estimator,
    search_spaces=param,
    cv=3,
    scoring="r2",
    random_state=42,
    n_iter=3,
    verbose=1,
)
opt.fit(X, y)
First, GaussianProcessRegressor does not seem to accept string aliases for kernels, at least in the current release; the kernel argument has to be a kernel object such as RBF() or Matern().
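A minimal sketch of where the error presumably comes from, using only scikit-learn: the message in the question is the one sklearn.base.clone raises for objects without a get_params method. A kernel instance can be cloned, a plain string cannot. The variable names here are illustrative.

```python
from sklearn.base import clone
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# A kernel instance implements get_params, so scikit-learn can clone it.
kernel_copy = clone(RBF(length_scale=1.0))

# A plain string has no get_params, so cloning it raises a TypeError.
try:
    clone("rbf")
    string_clone_failed = False
except TypeError as exc:
    string_clone_failed = True
    print(exc)

# Passing a kernel object (not a string) is the supported way to set the kernel.
gpr = GaussianProcessRegressor(kernel=RBF())
```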
Passing kernel objects instead raises another issue, however: if you give the kernel parameter a list of kernel instances as its search space, skopt is unable to process it (unhashable type). As far as I'm aware this is still an open issue, though there's a proposed workaround at the bottom of the issue page.
Another possible workaround is to construct separate base estimators, each with a specific kernel, and search over those:
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern
from sklearn.pipeline import Pipeline
from skopt import BayesSearchCV
from skopt.space import Categorical

X, y = make_regression(100, 10)

# One fully constructed estimator per kernel to try.
estimator_list = [
    GaussianProcessRegressor(kernel=RBF()),
    GaussianProcessRegressor(kernel=Matern()),
]

# Placeholder step; the search swaps in one of the estimators above.
pipe = Pipeline([('estimator', GaussianProcessRegressor())])

param = {
    'estimator': Categorical(estimator_list),
    'estimator__n_restarts_optimizer': (5, 10),
    'estimator__alpha': (1e-5, 1e-2, 'log-uniform'),
}
opt = BayesSearchCV(
    estimator=pipe,
    search_spaces=param,
    cv=3,
    scoring="r2",
    random_state=42,
    n_iter=3,
    verbose=1,
)
opt.fit(X, y)
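To illustrate why swapping whole estimators works, here is a small sketch using only scikit-learn: Pipeline.set_params can replace a step by name and set nested parameters on the replacement in the same call. The names candidate and step and the alpha value are illustrative, and that the search applies each sampled candidate this way is my assumption rather than something checked against skopt's internals.

```python
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.pipeline import Pipeline

X, y = make_regression(n_samples=30, n_features=3, random_state=0)

# Pipeline with a placeholder estimator step, as in the workaround.
pipe = Pipeline([("estimator", GaussianProcessRegressor())])

# Replace the step and set a nested parameter on the replacement in one call.
candidate = GaussianProcessRegressor(kernel=Matern())
pipe.set_params(estimator=candidate, estimator__alpha=1e-3)

step = pipe.named_steps["estimator"]
print(type(step.kernel).__name__, step.alpha)

# Fitting the pipeline fits the swapped-in estimator with its own kernel.
pipe.fit(X, y)
fitted_kernel = step.kernel_
```

After a real search, opt.best_params_['estimator'] should then tell you which of the candidate estimators (and therefore which kernel) was selected.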