Tags: python-3.x, machine-learning, sgd, skopt, bayessearchcv

BayesSearchCV is not working during SGDClassifier parameter tuning


I am trying to use BayesSearchCV to tune the parameters of an SGDClassifier. Below is the code I tried.

import seaborn
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from skopt import BayesSearchCV
from sklearn.linear_model import SGDClassifier

df = seaborn.load_dataset("iris")
df_features = df.drop(['species'], axis=1)
label_encoder = LabelEncoder()
# Encode the species names as integer labels (a 1-D array, which scikit-learn expects for y)
df_target = label_encoder.fit_transform(df['species'])

X_train, X_test, y_train, y_test = train_test_split(df_features, df_target, test_size=0.25, random_state=0)

model = SGDClassifier()

model_param = {
    'penalty': ['l2', 'l1', 'elasticnet'],
    'l1_ratio': [0, 0.05, 0.1, 0.2, 0.5, 0.8, 0.9, 0.95, 1],
    'loss': ['hinge', 'log', 'modified_huber', 'squared_hinge', 'perceptron', 'squared_loss', 'huber',
             'epsilon_insensitive', 'squared_epsilon_insensitive'],
    'alpha': [10 ** x for x in range(-6, 1)],
    'random_state': [0]
}

opt = BayesSearchCV(model, model_param, n_iter=32, cv=3)
opt.fit(X_train, y_train)
opt_pred_values = opt.predict(X_test)

This produces the following error:

ValueError: invalid literal for int() with base 10: '0.8'

I also tested GridSearchCV and RandomizedSearchCV with the same model_param, and both work fine. How can I use BayesSearchCV without this error? What do I need to change, or which parameter do I need to remove?

[Update]

If I remove 'l1_ratio' from model_param, the code above works. How can I run it while keeping 'l1_ratio'?


Solution

  • After trying several combinations of parameters, I found that the code works if I remove 'l1_ratio'. I then tried the following definitions of 'l1_ratio':

    'l1_ratio': [0.0, 0.05, 0.1, 0.2, 0.5, 0.8, 0.9, 0.95, 1.0]
    'l1_ratio': [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.85, 0.9, 1]
    'l1_ratio': [10 ** x for x in range(-1, 1)]
    'l1_ratio': [float(x/10) for x in range(1, 10)]
    

    All of these work. In the end, I changed 0 to 0.0 and 1 to 1.0 in the search space of 'l1_ratio', so that every candidate is a float.

    I am leaving the solution here for future reference. Maybe it will help someone someday.
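
For anyone who wants to check a search space before passing it to BayesSearchCV, here is a minimal sketch in plain Python. The helpers mixed_numeric_keys and floats_only are hypothetical names made up for this illustration; they are not part of skopt or scikit-learn. The dictionary is an abbreviated copy of model_param from the question. Only the 'l1_ratio' change is confirmed above. Converting 'alpha' as well is a precaution I am assuming is harmless: 10 ** 0 evaluates to the int 1 in Python, so that list also mixes int and float.

```python
# Hypothetical helpers for illustration; not part of skopt or scikit-learn.

def mixed_numeric_keys(space):
    """Return the keys whose candidate lists contain both int and float values."""
    mixed = []
    for name, values in space.items():
        kinds = {
            type(v)
            for v in values
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        }
        if kinds == {int, float}:
            mixed.append(name)
    return mixed


def floats_only(values):
    """Convert every numeric candidate to a Python float."""
    return [float(v) for v in values]


# Abbreviated copy of the search space from the question.
model_param = {
    'penalty': ['l2', 'l1', 'elasticnet'],
    'l1_ratio': [0, 0.05, 0.1, 0.2, 0.5, 0.8, 0.9, 0.95, 1],
    'alpha': [10 ** x for x in range(-6, 1)],
    'random_state': [0],
}

# Report the parameters that mix int and float candidates.
print(mixed_numeric_keys(model_param))

# Rewrite only those lists so that every candidate is a float.
for name in mixed_numeric_keys(model_param):
    model_param[name] = floats_only(model_param[name])

print(model_param['l1_ratio'])
```

After this step the 'l1_ratio' list is the same as the first variant listed above (0.0 and 1.0 instead of 0 and 1), and the dictionary can be passed to BayesSearchCV as in the question. Lists that contain only ints or only strings, such as 'random_state' and 'penalty', are left untouched.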