There are a couple of other questions similar to this one, but I couldn't find a solution that seems to fit. I am using LightGBM with Scikit-Optimize's BayesSearchCV.
import lightgbm as lgbm
import skopt as sko
import sklearn.pipeline as skl
from sklearn.metrics import make_scorer

full_pipeline = skl.Pipeline(steps=[('preprocessor', pre_processor),
                                    ('estimator', lgbm.sklearn.LGBMClassifier())])
scorer = make_scorer(fl.lgb_focal_f1_score)
lgb_tuner = sko.BayesSearchCV(full_pipeline, hyper_space, cv=5, refit=True,
                              n_iter=num_calls, scoring=scorer)
lgb_tuner.fit(balanced_xtrain, balanced_ytrain)
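For context, hyper_space is a dictionary of skopt search dimensions keyed by the pipeline parameter names (i.e. with the estimator__ prefix). The exact ranges aren't important here, but it is roughly of this shape (illustrative values, not my actual ranges):

from skopt.space import Integer, Real

# Illustrative shape only -- the real bounds turned out to matter (see the end).
hyper_space = {
    'estimator__num_leaves': Integer(2, 256),
    'estimator__min_child_samples': Integer(1, 100),
    'estimator__learning_rate': Real(1e-3, 0.3, prior='log-uniform'),
    'estimator__subsample': Real(0.5, 1.0),
}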
Training runs for a while before it errors with the following:
Traceback (most recent call last):
File "/var/training.py", line 134, in <module>
lgb_tuner.fit(balanced_xtrain, balanced_ytrain)
File "/usr/local/lib/python3.6/site-packages/skopt/searchcv.py", line 694, in fit
groups=groups, n_points=n_points_adjusted
File "/usr/local/lib/python3.6/site-packages/skopt/searchcv.py", line 579, in _step
self._fit(X, y, groups, params_dict)
File "/usr/local/lib/python3.6/site-packages/skopt/searchcv.py", line 423, in _fit
for parameters in parameter_iterable
File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 1041, in __call__
if self.dispatch_one_batch(iterator):
File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
self._dispatch(tasks)
File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 777, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/usr/local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "/usr/local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 572, in __init__
self.results = batch()
File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 263, in __call__
for func, args, kwargs in self.items]
File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 263, in <listcomp>
for func, args, kwargs in self.items]
File "/usr/local/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 531, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/usr/local/lib/python3.6/site-packages/sklearn/pipeline.py", line 335, in fit
self._final_estimator.fit(Xt, y, **fit_params_last_step)
File "/usr/local/lib/python3.6/site-packages/lightgbm/sklearn.py", line 857, in fit
callbacks=callbacks, init_model=init_model)
File "/usr/local/lib/python3.6/site-packages/lightgbm/sklearn.py", line 617, in fit
callbacks=callbacks, init_model=init_model)
File "/usr/local/lib/python3.6/site-packages/lightgbm/engine.py", line 252, in train
booster.update(fobj=fobj)
File "/usr/local/lib/python3.6/site-packages/lightgbm/basic.py", line 2467, in update
return self.__boost(grad, hess)
File "/usr/local/lib/python3.6/site-packages/lightgbm/basic.py", line 2503, in __boost
ctypes.byref(is_finished)))
File "/usr/local/lib/python3.6/site-packages/lightgbm/basic.py", line 55, in _safe_call
raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) at /__w/1/s/python-package/compile/src/treelearner/serial_tree_learner.cpp, line 651 .
Some answers to similar questions have suggested it might be a consequence of using the GPU, but I do not have a GPU available. I don't know what else is causing it or how to try and fix it. Can anyone suggest anything?
I think this was due to my hyperparameter search bounds being wrong, which allowed a hyperparameter that shouldn't be zero to be sampled as zero, though I'm not sure which one it was.
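In case it helps anyone hitting the same check failure: the fix was to tighten the lower bounds of the search space so that parameters which must stay strictly positive can never be sampled as zero. A hypothetical before/after (I never pinned down which parameter was actually responsible):

# Before (hypothetical culprit): a lower bound of 0 lets BayesSearchCV
# propose a degenerate value, e.g. min_child_samples=0 or subsample=0.0.
# 'estimator__min_child_samples': Integer(0, 100),
# 'estimator__subsample': Real(0.0, 1.0),

# After: keep every sampled value strictly positive.
hyper_space['estimator__min_child_samples'] = Integer(1, 100)
hyper_space['estimator__subsample'] = Real(0.5, 1.0)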