I am migrating from XGBoost to LightGBM (since I need its exact handling of interaction constraints) and I am struggling to understand the result of LightGBM CV. In the example below, the minimum log-loss is achieved at iteration 125, but model['cvbooster'].best_iteration returns -1. I would have expected it to return 125 as well, or am I misunderstanding something here? Is there a better way to get the best iteration, or does one just need to check manually?
I have seen this discussion, but even when I check the individual boosters in cvbooster (e.g., model['cvbooster'].boosters[0].best_iteration), they all return -1 as well...
import lightgbm as lgb
import numpy as np
from sklearn import datasets
X, y = datasets.make_classification(n_samples=10_000, n_features=5, n_informative=3, random_state=9)
data_train_lgb = lgb.Dataset(X, label=y)
param = {'objective': 'binary',
         'metric': ['binary_logloss'],
         'device_type': 'cuda'}
model = lgb.cv(param,
               data_train_lgb,
               num_boost_round=1_000,
               return_cvbooster=True)
opt_1 = np.argmin(model['valid binary_logloss-mean'])
print(f"index argmin: {opt_1}")
print(f"logloss argmin: {model['valid binary_logloss-mean'][opt_1]}")
opt_2 = model['cvbooster'].best_iteration
print(f"index best_iteration: {opt_2}")
print(f"logloss best_iteration: {model['valid binary_logloss-mean'][opt_2]}")
---
>>> index argmin: 125
>>> logloss argmin: 0.13245999867688793
>>> index best_iteration: -1
>>> logloss best_iteration: 0.2661896445658779
In lightgbm (the Python package for LightGBM), best_iteration isn't the iteration where the model achieved the best performance on evaluation metrics. It's the last iteration (1-based) where performance on evaluation metrics improved, and it is only set if early stopping is used. Without early stopping it stays at -1.
See this example (using lightgbm==4.5.0, scikit-learn==1.6.0, and Python 3.11).
import lightgbm as lgb
import numpy as np
from sklearn import datasets
X, y = datasets.make_classification(
    n_samples=10_000,
    n_features=5,
    n_informative=3,
    random_state=9
)
params = {
    "deterministic": True,
    "objective": "binary",
    "metric": "binary_logloss",
    "seed": 708
}
# train without early stopping
model = lgb.cv(
    params=params,
    train_set=lgb.Dataset(X, label=y),
    num_boost_round=200,
    return_cvbooster=True
)
model['cvbooster'].best_iteration
# -1
opt_1 = np.argmin(model['valid binary_logloss-mean'])
print(f"index argmin: {opt_1}")
# index argmin: 114
print(f"logloss argmin: {model['valid binary_logloss-mean'][opt_1]:.6f}")
# logloss argmin: 0.132579
# train WITH early stopping
model = lgb.cv(
    params={**params, "early_stopping_rounds": 5},
    train_set=lgb.Dataset(X, label=y),
    num_boost_round=200,
    return_cvbooster=True
)
model['cvbooster'].best_iteration
# 115
opt_1 = np.argmin(model['valid binary_logloss-mean'])
print(f"index argmin: {opt_1}")
# index argmin: 114
print(f"logloss argmin: {model['valid binary_logloss-mean'][opt_1]:.6f}")
# logloss argmin: 0.132579
Notes on that:
- "deterministic": True and setting "seed" to a positive value help make training deterministic
- early stopping in cv() can be enabled by passing a positive value for "early_stopping_rounds" through params
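If you'd rather not use early stopping, the manual check from the question works too. Just keep in mind that np.argmin() gives a 0-based index while best_iteration is 1-based (114 vs. 115 in the output above). Here is a minimal sketch of that conversion, assuming a lower-is-better metric such as binary_logloss. Note that best_iteration_from_cv is a hypothetical helper (not part of lightgbm), and toy_results is a made-up stand-in for the dict returned by lgb.cv().

```python
def best_iteration_from_cv(cv_results, key="valid binary_logloss-mean"):
    """Return (1-based best iteration, metric value) for a lower-is-better metric.

    Hypothetical helper, not part of lightgbm. `cv_results` is expected to look
    like the dict returned by lgb.cv(): metric name -> list of per-iteration means.
    """
    scores = list(cv_results[key])
    # 0-based position of the smallest mean score (first one wins on ties)
    best_index = min(range(len(scores)), key=scores.__getitem__)
    # +1 converts the 0-based list index to a 1-based iteration count
    return best_index + 1, scores[best_index]


# Made-up values standing in for the output of lgb.cv()
toy_results = {"valid binary_logloss-mean": [0.60, 0.35, 0.20, 0.15, 0.18, 0.25]}

best_iter, best_score = best_iteration_from_cv(toy_results)
print(f"best iteration (1-based): {best_iter}")
print(f"logloss at best iteration: {best_score}")
```

The 1-based value is the one to reuse as a number of boosting rounds, for example as num_boost_round when refitting on the full data with lgb.train().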