I have previously used `mlr3` for imbalanced classification problems, and used `PipeOpClassWeights` to apply class weights to learners during training. This pipe op adds a column of observation weights to the `Task` (in the `Task$weights` property). These observation weights are then passed to the `Learner` during training.
Some classification performance measures, such as the Brier score (`classif.bbrier`) and log loss (`classif.logloss`), can be calculated with class weights applied (this is what happens to the training log loss when we train with class weights).
My question is: when we perform validation with `resample` and aggregate performance metrics from the results, as in

```r
resampling = rsmp("cv", folds = 4)
measure = msr("classif.bbrier")
result = resample(task, learner, resampling)
score = result$aggregate(measure)  # aggregate is called on the ResampleResult
```

are the class weights also applied when computing this validation score (for measures that support weights)? And is the same true during hyperparameter tuning, for example with `AutoTuner`?
I looked in the documentation for the aforementioned classes and in the resampling chapter of the mlr3 book, but couldn't find an answer. I would assume we want the same class weights that are applied to the training loss to also be applied to the validation loss, at least for hyperparameter tuning if not for final performance estimation.
I was inspired to investigate this after coming across a similar issue with the validation scores in `xgboost`'s Python implementation, discussed here.
Yes, they are. If they're not, that's a bug.
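You can also check this empirically. Below is a minimal sketch, assuming the built-in `spam` task and an `rpart` learner purely for illustration (any imbalanced binary task and probabilistic learner would do): it compares the score mlr3 reports against a manually computed *unweighted* Brier score on the same predictions. If the two values coincide, the observation weights did not enter the validation score.

```r
library(mlr3)
library(mlr3pipelines)

set.seed(1)

# Illustrative task; any binary classification task works
task = tsk("spam")

# Graph learner: add class weights, then fit a probabilistic tree
learner = as_learner(
  po("classweights", minor_weight = 5) %>>%
  lrn("classif.rpart", predict_type = "prob")
)

# Holdout avoids fold averaging, so the scores are directly comparable
resampling = rsmp("holdout")
result = resample(task, learner, resampling)

# Score as reported by mlr3
score = result$aggregate(msr("classif.bbrier"))

# Manual unweighted Brier score on the same predictions
pred = result$prediction()
p = pred$prob[, task$positive]
y = as.integer(pred$truth == task$positive)
manual = mean((p - y)^2)

# A difference between the two indicates the weights were applied at scoring time
print(c(mlr3 = unname(score), unweighted = manual))
```

Note that with `rsmp("cv", folds = 4)` the aggregated score is a macro average over folds, while `result$prediction()` pools all folds, so for a fair comparison under cross-validation you would compare per-fold scores from `result$score(measure)` instead.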