When using R's caret to compare multiple models on the same data set, caret is smart enough to select different tuning ranges for different models if the same tuneLength is specified for all models and no model-specific tuneGrid is specified.
For example, the tuning ranges chosen by caret for one particular data set are:
earth(nprune): 2, 5, 8, 11, 14
gamSpline(df): 1, 1.5, 2, 2.5, 3
rpart(cp): 0.010, 0.054, 0.116, 0.123, 0.358
How does caret determine these default tuning ranges? I have been searching through the documentation but still haven't pinned down the algorithm it uses to choose the ranges.
It depends on the model. For rpart and a few others, it fits an initial model to get a sense of what reasonable values should be. In other cases, it is less intelligent. For example, for gamSpline it is expand.grid(df = seq(1, 3, length = len)).
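To make the gamSpline case concrete, here is a minimal base-R sketch of that expression. It assumes len takes the value passed as tuneLength (5 in the question); the helper name gam_spline_grid is hypothetical and is not part of caret.

```r
# Hypothetical helper mirroring the expression quoted above; not a caret function.
# `len` is assumed to be the tuneLength value supplied to train().
gam_spline_grid <- function(len) {
  expand.grid(df = seq(1, 3, length.out = len))
}

grid <- gam_spline_grid(5)
print(grid$df)
```

With len = 5 this yields df values of 1, 1.5, 2, 2.5 and 3, which matches the gamSpline range listed in the question.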
You can see what it does per model using getModelInfo:
> getModelInfo("earth")[[1]]$grid
function(x, y, len = NULL) {
  dat <- if(is.data.frame(x)) x else as.data.frame(x)
  dat$.outcome <- y
  mod <- earth( .outcome~., data = dat, pmethod = "none")
  maxTerms <- nrow(mod$dirs)
  maxTerms <- min(200, floor(maxTerms * .75) + 2)
  data.frame(nprune = unique(floor(seq(2, to = maxTerms, length = len))),
             degree = 1)
}
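The last steps of that function are plain arithmetic, so they can be sketched in base R without fitting a model. The helper name earth_nprune_grid and the value of 16 terms are assumptions made for illustration; in the real function the term count comes from nrow(mod$dirs) on the unpruned earth fit for your data.

```r
# Hypothetical helper: reproduces only the arithmetic of the earth grid function.
# `n_terms` stands in for nrow(mod$dirs) from the unpruned earth fit.
earth_nprune_grid <- function(n_terms, len) {
  # Cap the upper end at 75% of the available terms (plus 2), and at most 200.
  max_terms <- min(200, floor(n_terms * 0.75) + 2)
  # Spread `len` candidate values between 2 and the cap, dropping duplicates.
  unique(floor(seq(2, to = max_terms, length.out = len)))
}

nprune <- earth_nprune_grid(n_terms = 16, len = 5)
print(nprune)
```

Under these assumed inputs the result is 2, 5, 8, 11, 14, the same nprune values as in the question. Whether the asker's unpruned fit really had 16 terms is not known; it is only a value consistent with that output.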
Max