I have just made the switch from caret
to mlr
for a specific problem I am working on at the moment.
I am wondering if anyone here is familiar with specifying custom performance measures within the resample()
function.
Here's a reproducible code example:
library(mlr)
library(mlbench)
data(BostonHousing, package = "mlbench")
task_reg1 <- makeRegrTask(id = "bh", data = BostonHousing, target = "medv")
lrn_reg1 <- makeLearner(cl = "regr.randomForest",
predict.type = "response",
mtry=3)
cv_reg1 <- makeResampleDesc("RepCV", folds = 5, reps = 5)
regr_1 <- resample(learner = lrn_reg1,
task = task_reg1,
resampling = cv_reg1,
measures = mlr::rmse)
Instead of computing RMSE, I want to compute the Mean Absolute Scaled Error, MASE. A function for this may, for instance, be found in the Metrics
package: Metrics::mase()
.
I tried to include measures = Metrics::mase
directly in the resample()
call, but that was, as expected, a bit optimistic and I received the following error:
Error in checkMeasures(measures, task) :
Assertion on 'measures' failed: Must be of type 'list', not 'closure'.
I found out there's a function in the mlr
package for creating custom performance metrics, called makeMeasure()
(https://rdrr.io/cran/mlr/man/makeMeasure.html). I tried experimenting a bit with it, but did not manage to make anything work. I do not have much experience in tinkering with custom made functions, so I was hoping that someone here could help me out, or provide some resources for stuff like this.
Cheers!
You need to construct a function that can be applied within makeMeasure()
that is of the form function(task, model, pred, extra.args)
. We can just write a wrapper around Metrics::mase()
so you can use this function in resample()
, and you can do the same for any other metric you find.
mase_fun <- function(task, model, pred, feats, extra.args) {
Metrics::mase(pred$data$truth, pred$data$response, step_size = extra.args$step_size)
}
mase_measure <- makeMeasure(id = "mase",
minimize = T,
properties = c("regr", "req.pred", "req.truth"),
fun = mase_fun,
extra.args = list(step_size = 1))
resample(learner = lrn_reg1,
task = task_reg1,
resampling = cv_reg1,
measures = mase_measure)