Hello, I have the following ranger model:
X <- train_df[, -1]
y <- train_df$Price
rf_model <- ranger(Price ~ ., data = train_df, mtry = 11, splitrule = "extratrees", min.node.size = 1, num.trees = 100)
I am trying to accomplish two things,
What I have tried:
**The following worked for tuning mtry, splitrule and min.node.size, but I cannot add the number of trees to the grid: doing so gives me an error.**
# define the parameter grid to search over
param_grid <- expand.grid(mtry = 1:ncol(X),
                          splitrule = c("variance", "extratrees", "maxstat"),
                          min.node.size = c(1, 5, 10))
# set up the cross-validation scheme
cv_scheme <- trainControl(method = "cv",
                          number = 5,
                          verboseIter = TRUE)
# perform the grid search using caret
rf_model <- train(x = X,
                  y = y,
                  method = "ranger",
                  trControl = cv_scheme,
                  tuneGrid = param_grid)
# view the best parameter values
rf_model$bestTune
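One easy way is to pass a num.trees argument to train and iterate over its values. caret's built-in ranger method only tunes mtry, splitrule and min.node.size, which is why adding num.trees to tuneGrid gives an error; extra arguments given to train are passed through to ranger unchanged.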
The other way is to create your own customized model; see the caret documentation chapter "Using Your Own Model". There is also an RPubs write-up by Pham Dinh Khanh demonstrating this.
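The custom-model route could look roughly like the sketch below. It is not taken from the chapter or the RPubs write-up: it copies caret's built-in ranger definition with getModelInfo, registers num.trees as a fourth tuning parameter, and replaces the fit function so that the value reaches ranger. The names ranger_nt and grid_nt are made up for illustration, and the grid values are arbitrary.

```r
library(caret)
library(mlbench)
library(ranger)

data(PimaIndiansDiabetes)
x <- PimaIndiansDiabetes[, -ncol(PimaIndiansDiabetes)]
y <- PimaIndiansDiabetes[, ncol(PimaIndiansDiabetes)]

# Start from caret's own ranger definition (ranger_nt is a hypothetical name).
ranger_nt <- getModelInfo("ranger", regex = FALSE)[[1]]

# Register num.trees as an additional tuning parameter.
ranger_nt$parameters <- rbind(
  ranger_nt$parameters,
  data.frame(parameter = "num.trees", class = "numeric", label = "#Trees")
)

# Replace the fit function so that param$num.trees is forwarded to ranger.
ranger_nt$fit <- function(x, y, wts, param, lev, last, classProbs, ...) {
  if (!is.data.frame(x)) x <- as.data.frame(x, stringsAsFactors = TRUE)
  n_pred <- ncol(x)
  x$.outcome <- y
  ranger::ranger(
    dependent.variable.name = ".outcome",
    data = x,
    mtry = min(param$mtry, n_pred),
    splitrule = as.character(param$splitrule),
    min.node.size = param$min.node.size,
    num.trees = param$num.trees,
    write.forest = TRUE,
    probability = classProbs,
    case.weights = wts,
    ...
  )
}

# num.trees is now an ordinary column of the tuning grid.
grid_nt <- expand.grid(mtry = 1:4,
                       splitrule = c("gini", "extratrees"),
                       min.node.size = c(1, 5),
                       num.trees = c(50, 100),
                       stringsAsFactors = FALSE)

set.seed(123)
fit_nt <- train(x = x,
                y = y,
                method = ranger_nt,
                trControl = trainControl(method = "cv", number = 5),
                tuneGrid = grid_nt)
fit_nt$bestTune
```

The prediction code is inherited unchanged from the built-in definition, so only the parameter table and the fit function differ. A tuneGrid has to be supplied explicitly here, because the inherited default grid function knows nothing about num.trees.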
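# First approach: pass num.trees to train() and loop over candidate values
library(caret)
library(mlbench)
library(ranger)

data(PimaIndiansDiabetes)
# predictors are all columns but the last; the outcome is the last column
x <- PimaIndiansDiabetes[, -ncol(PimaIndiansDiabetes)]
y <- PimaIndiansDiabetes[, ncol(PimaIndiansDiabetes)]

param_grid <- expand.grid(mtry = 1:4,
                          splitrule = c("variance", "extratrees"),
                          min.node.size = c(1, 5))
cv_scheme <- trainControl(method = "cv",
                          number = 5,
                          verboseIter = FALSE)

# fit one tuned model per num.trees value and keep them in a named list
models <- list()
for (ntree in c(4, 100)) {
  set.seed(123)  # same seed, so every num.trees value sees the same folds
  rf_model <- train(x = x,
                    y = y,
                    method = "ranger",
                    trControl = cv_scheme,
                    tuneGrid = param_grid,
                    num.trees = ntree)
  name <- paste0(ntree, "_tr_model")
  models[[name]] <- rf_model
}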
models[["4_tr_model"]]
#> Random Forest
#>
#> 768 samples
#> 8 predictor
#> 2 classes: 'neg', 'pos'
#>
#> No pre-processing
#> Resampling: Cross-Validated (5 fold)
#> Summary of sample sizes: 614, 615, 614, 615, 614
#> Resampling results across tuning parameters:
#>
#> mtry splitrule min.node.size Accuracy Kappa
#> 1 variance 1 NaN NaN
#> 1 variance 5 NaN NaN
#> 1 extratrees 1 0.6808675 0.2662428
#> 1 extratrees 5 0.6783125 0.2618862
...
models[["100_tr_model"]]
#> Random Forest
...
#>
#> mtry splitrule min.node.size Accuracy Kappa
#> 1 variance 1 NaN NaN
#> 1 variance 5 NaN NaN
#> 1 extratrees 1 0.7473559 0.3881530
#> 1 extratrees 5 0.7564808 0.4112127
...
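To choose among the num.trees values, one option is to pull the best cross-validated row out of each fitted model and compare them side by side. The sketch below assumes a named list of train objects like the models list built in the loop; best_per_model is a hypothetical helper, not a caret function. It drops rows whose metric is NaN. Those appear above for splitrule "variance", which is a regression split rule and so fails on this classification outcome ("gini" would be the classification counterpart).

```r
# Hypothetical helper (not part of caret): for each fitted model in a named
# list, return the row of $results with the highest value of `metric`.
best_per_model <- function(models, metric = "Accuracy") {
  rows <- lapply(names(models), function(nm) {
    res <- models[[nm]]$results
    # drop parameter combinations that failed to fit (metric is NA/NaN)
    res <- res[!is.na(res[[metric]]), , drop = FALSE]
    if (nrow(res) == 0) return(NULL)
    best <- res[which.max(res[[metric]]), , drop = FALSE]
    data.frame(model = nm, best, stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  if (is.null(out)) return(NULL)
  # best model first
  out <- out[order(out[[metric]], decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}
```

Calling best_per_model(models) returns one row per num.trees value, sorted by Accuracy, with the winning mtry, splitrule and min.node.size in the same row. For a regression outcome such as Price, where lower is better, the helper would need which.min and an ascending sort on RMSE instead.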
Created on 2023-04-19 with reprex v2.0.2