Tags: r, random-forest, tidymodels, r-ranger

Can the out-of-bag error for a random forest model in R's tidymodels framework be obtained?


If you call the ranger function directly, you can obtain the out-of-bag error from the resulting object of class ranger.
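
For reference, a minimal sketch of that direct route. The built-in iris data and the object name rf_direct are assumptions chosen purely for illustration; they do not come from the question.

```r
library(ranger)

set.seed(123)

# Fit a classification forest directly with ranger.
# oob.error = TRUE is ranger's default; it is spelled out here for clarity.
rf_direct <- ranger(Species ~ ., data = iris, num.trees = 500, oob.error = TRUE)

# The out-of-bag error is stored on the fitted object
# (misclassification rate for a plain classification forest).
rf_direct$prediction.error
```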

If you instead go the tidymodels route, setting up a recipe and a model specification/engine with tuning parameters, etc., how can that same error be extracted? The tidymodels approach doesn't seem to hold on to that information.


Solution

  • The ranger object is kept inside the parsnip object as $fit, and the out-of-bag error is stored in its prediction.error element:

    library(tidymodels)
    
    data("ad_data", package = "modeldata")
    
    rf_spec <- 
      rand_forest() %>% 
      set_engine("ranger", oob.error = TRUE) %>% 
      set_mode("classification")
    
    rf_fit <- rf_spec %>%
      fit(Class ~ ., data = ad_data)
    
    rf_fit
    #> parsnip model object
    #> 
    #> Fit time:  158ms 
    #> Ranger result
    #> 
    #> Call:
    #>  ranger::ranger(x = maybe_data_frame(x), y = y, oob.error = ~TRUE,      num.threads = 1, verbose = FALSE, seed = sample.int(10^5,          1), probability = TRUE) 
    #> 
    #> Type:                             Probability estimation 
    #> Number of trees:                  500 
    #> Sample size:                      333 
    #> Number of independent variables:  130 
    #> Mtry:                             11 
    #> Target node size:                 10 
    #> Variable importance mode:         none 
    #> Splitrule:                        gini 
    #> OOB prediction error (Brier s.):  0.1340793
    
    class(rf_fit)
    #> [1] "_ranger"   "model_fit"
    class(rf_fit$fit)
    #> [1] "ranger"
    
    rf_fit$fit$prediction.error
    #> [1] 0.1340793
    

    Created on 2021-03-11 by the reprex package (v1.0.0)
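
Since the question mentions a recipe, the same idea can be sketched for a fitted workflow. This assumes a version of workflows that provides extract_fit_engine(), which is newer than the reprex above; in older versions, pull_workflow_fit(rf_wf_fit)$fit should play the same role. The object names rf_rec, rf_wf, rf_wf_fit and ranger_obj are made up for this example.

```r
library(tidymodels)

data("ad_data", package = "modeldata")

set.seed(123)

rf_spec <-
  rand_forest(trees = 500) %>%
  set_engine("ranger") %>%
  set_mode("classification")

# A bare recipe with no preprocessing steps, just to mirror the recipe-based setup
rf_rec <- recipe(Class ~ ., data = ad_data)

rf_wf <-
  workflow() %>%
  add_recipe(rf_rec) %>%
  add_model(rf_spec)

rf_wf_fit <- fit(rf_wf, data = ad_data)

# Pull the underlying ranger object out of the fitted workflow
# (assumes extract_fit_engine() is available in the installed version)
ranger_obj <- extract_fit_engine(rf_wf_fit)

# Out-of-bag prediction error; parsnip fits a probability forest
# for classification, so this is a Brier score
oob_error <- ranger_obj$prediction.error
oob_error
```

This covers a workflow fitted with fit(). By default, tuning functions such as tune_grid() do not keep the fitted models, so the out-of-bag error would have to be captured during resampling (for example through the extract argument of control_grid()) or taken from a final fit on the chosen parameters.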