I tried to find the best combination of hyperparameters for LogisticRegression in sklearn. Below is an example of my code:
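from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # imblearn's Pipeline accepts sampler steps such as SMOTE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([("scaler", StandardScaler()),
                     ("smt", SMOTE(random_state=42)),
                     ("logreg", LogisticRegression())])
parameters = [{'logreg__solver': ['saga']},
              {'logreg__penalty': ['l1', 'l2']},
              {'logreg__C': [1e-3, 0.1, 1, 10, 100]}]
grid_pipeline = GridSearchCV(pipeline,
                             parameters,
                             scoring='f1',
                             n_jobs=5, verbose=5,
                             return_train_score=True,
                             cv=5)
grid_result = grid_pipeline.fit(X_train, y_train)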
During fitting I get the following error:
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.
For some reason, the default value 'lbfgs' is used for the solver parameter instead of the chosen 'saga'. Why does this happen?
I think the issue is how you have specified parameters. To get the desired behaviour, use a single dict as follows:
parameters = {'logreg__solver': ['saga'],
              'logreg__penalty': ['l1', 'l2'],
              'logreg__C': [1e-3, 0.1, 1, 10, 100]}
You had specified it as a list of dicts. GridSearchCV treats each dict in such a list as a separate grid and searches them one after another, rather than combining them. The second grid, {'logreg__penalty': ['l1', 'l2']}, never sets the solver, so it requests l1 on the default (non-saga) solver. Those two options are not compatible.
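To see the difference between the two forms, here is a minimal sketch that enumerates the candidates each one produces. The expand helper is hypothetical and written only for illustration; it is a simplified stand-in for the grid expansion, not scikit-learn's own code (sklearn.model_selection.ParameterGrid is the real utility if you want to inspect an actual grid).

```python
from itertools import product


def expand(grid):
    """Hypothetical helper: list every candidate in a grid.

    A dict yields the Cartesian product of its value lists.
    A list of dicts yields each dict's product separately, one after another.
    """
    grids = [grid] if isinstance(grid, dict) else grid
    candidates = []
    for g in grids:
        keys = sorted(g)
        for values in product(*(g[k] for k in keys)):
            candidates.append(dict(zip(keys, values)))
    return candidates


list_of_dicts = [{'logreg__solver': ['saga']},
                 {'logreg__penalty': ['l1', 'l2']},
                 {'logreg__C': [1e-3, 0.1, 1, 10, 100]}]
single_dict = {'logreg__solver': ['saga'],
               'logreg__penalty': ['l1', 'l2'],
               'logreg__C': [1e-3, 0.1, 1, 10, 100]}

separate = expand(list_of_dicts)  # 1 + 2 + 5 = 8 candidates
combined = expand(single_dict)    # 1 * 2 * 5 = 10 candidates

print(len(separate), len(combined))
print(separate[1])  # {'logreg__penalty': 'l1'}: no solver is set here
```

Every candidate in combined carries 'logreg__solver': 'saga', while the penalty candidates in separate leave the solver unset, which is where the lbfgs error comes from.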