python machine-learning lightgbm scikit-optimize

lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0)


There are a couple of other questions similar to this one, but I couldn't find a solution that fits my case. I am using LightGBM with Scikit-Optimize's BayesSearchCV.

import lightgbm as lgbm
import skopt as sko
import sklearn.pipeline as skl
from sklearn.metrics import make_scorer

full_pipeline = skl.Pipeline(steps=[('preprocessor', pre_processor),
                                    ('estimator', lgbm.sklearn.LGBMClassifier())])
scorer = make_scorer(fl.lgb_focal_f1_score)  # fl: my module with the focal-loss F1 scorer
lgb_tuner = sko.BayesSearchCV(full_pipeline, hyper_space, cv=5, refit=True,
                              n_iter=num_calls, scoring=scorer)
lgb_tuner.fit(balanced_xtrain, balanced_ytrain)
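
hyper_space is my Scikit-Optimize search space. The concrete dimensions below are only illustrative (not my real bounds), but they show the shape of it: each parameter is keyed with the 'estimator__' prefix so BayesSearchCV routes it to the LGBMClassifier step of the pipeline.

from skopt.space import Integer, Real

# Illustrative search space; the actual parameters and ranges I used differ.
hyper_space = {
    'estimator__n_estimators': Integer(50, 500),
    'estimator__num_leaves': Integer(2, 256),
    'estimator__learning_rate': Real(1e-3, 0.3, prior='log-uniform'),
    'estimator__subsample': Real(0.5, 1.0),
}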

Training runs for a while before it fails with the following error:

Traceback (most recent call last):
  File "/var/training.py", line 134, in <module>
    lgb_tuner.fit(balanced_xtrain, balanced_ytrain)
  File "/usr/local/lib/python3.6/site-packages/skopt/searchcv.py", line 694, in fit
    groups=groups, n_points=n_points_adjusted
  File "/usr/local/lib/python3.6/site-packages/skopt/searchcv.py", line 579, in _step
    self._fit(X, y, groups, params_dict)
  File "/usr/local/lib/python3.6/site-packages/skopt/searchcv.py", line 423, in _fit
    for parameters in parameter_iterable
  File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 1041, in __call__
    if self.dispatch_one_batch(iterator):
  File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
    self._dispatch(tasks)
  File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 777, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/usr/local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/usr/local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/usr/local/lib/python3.6/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/usr/local/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 531, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/usr/local/lib/python3.6/site-packages/sklearn/pipeline.py", line 335, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/usr/local/lib/python3.6/site-packages/lightgbm/sklearn.py", line 857, in fit
    callbacks=callbacks, init_model=init_model)
  File "/usr/local/lib/python3.6/site-packages/lightgbm/sklearn.py", line 617, in fit
    callbacks=callbacks, init_model=init_model)
  File "/usr/local/lib/python3.6/site-packages/lightgbm/engine.py", line 252, in train
    booster.update(fobj=fobj)
  File "/usr/local/lib/python3.6/site-packages/lightgbm/basic.py", line 2467, in update
    return self.__boost(grad, hess)
  File "/usr/local/lib/python3.6/site-packages/lightgbm/basic.py", line 2503, in __boost
    ctypes.byref(is_finished)))
  File "/usr/local/lib/python3.6/site-packages/lightgbm/basic.py", line 55, in _safe_call
    raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) at /__w/1/s/python-package/compile/src/treelearner/serial_tree_learner.cpp, line 651 .

Some answers to similar questions suggest it can be a consequence of using the GPU, but I do not have a GPU available. I don't know what else could be causing it or how to fix it. Can anyone suggest anything?


Solution

  • I think this was due to my hyperparameter limits being wrong, which allowed one hyperparameter to be set to zero when it shouldn't have been, though I'm not sure which one. A sketch of the kind of bounds that avoid this is below.
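
In case it helps anyone else: when defining the search space, keep count-type LightGBM parameters strictly positive. The parameter names below are only examples of lower bounds that avoid zero, not a diagnosis of which parameter was actually at fault in my case.

from skopt.space import Integer

# Example lower bounds that keep count-type parameters away from zero.
# (Illustrative names; I'm not certain which parameter caused my error.)
hyper_space = {
    'estimator__num_leaves': Integer(2, 256),          # a tree needs at least 2 leaves
    'estimator__min_child_samples': Integer(1, 100),   # each leaf needs at least 1 sample
}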