
How to find the number of samples picked in each bootstrap replicate of the stratified bootstrap in pROC?

This question concerns the roc function of the pROC package and its accompanying paper.

I am plotting confidence intervals on my ROC plot:

    roc(df$label,   # hypothetical response column; the first line of the original call is not shown
        df$score,
        ci = TRUE,
        ci.method = 'bootstrap')

I understand from the documentation that when ci.method is 'bootstrap', stratified bootstrapping takes place. How can I find the subsampling percentage used in each bootstrap replicate? Is it 80% of the total data, 70%, or something else? Can it be specified?

The paper states: "Bootstrap is stratified by default; in this case the same number of case and control observations than in the original sample will be selected in each bootstrap replicate.". I think they meant the same proportion of case and control observations in each replicate. However, the percentage of subsampling is not mentioned anywhere.

If I am interpreting this wrongly, please correct me.


  • Bootstrapping by definition is resampling of the whole data with replacement.

    If ci is TRUE and ci.method is "bootstrap", then the roc function eventually calls an internal stratified bootstrap helper, which looks like this:

    function (n, roc) 
    {
        # Resample each group separately, with replacement, at its original size
        controls <- sample(roc$controls, replace = TRUE)
        cases <- sample(roc$cases, replace = TRUE)
        thresholds <- roc_utils_thresholds(c(cases, controls), roc$direction)
        perfs <- roc$fun.sesp(thresholds = thresholds, controls = controls, 
            cases = cases, direction = roc$direction)
        roc$sensitivities <- perfs$se
        roc$specificities <- perfs$sp
        # AUC of the resampled curve
        auc.roc(roc, partial.auc = attr(roc$auc, "partial.auc"), 
            partial.auc.focus = attr(roc$auc, "partial.auc.focus"), 
            partial.auc.correct = attr(roc$auc, "partial.auc.correct"), 
            allow.invalid.partial.auc.correct = TRUE)
    }

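    The first two lines show that each group is resampled at 100% of its original size: sample() is called without a size argument, so size defaults to the length of its input. Every bootstrap replicate therefore contains exactly as many cases and as many controls as the original data, drawn with replacement. That is what the paper means by "the same number", not merely the same proportion.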

    > sample(1:10, replace=TRUE)
    # [1]  1 10  6  2  7  3  3  6  4  6
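
    The same behaviour can be checked on a toy example. This is only a sketch in base R: the group sizes (30 controls, 12 cases) and the simulated scores are made up for illustration and do not come from pROC or from the question's data.

    ```r
    set.seed(1)  # for reproducibility only

    # Hypothetical scores: 30 controls and 12 cases
    controls <- rnorm(30, mean = 0)
    cases    <- rnorm(12, mean = 1)

    # One stratified replicate, drawn the same way as in the function above
    boot_controls <- sample(controls, replace = TRUE)
    boot_cases    <- sample(cases, replace = TRUE)

    # Each resampled group has the same size as the original group
    length(boot_controls) == length(controls)
    length(boot_cases) == length(cases)

    # Share of distinct original controls that appear in this replicate
    length(unique(boot_controls)) / length(controls)
    ```

    Both comparisons return TRUE. Because the draw is with replacement, some observations appear several times and others not at all, so the last ratio is below 1 in almost every replicate (about 63% on average for large groups), even though the replicate itself is full size.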

    Note that n in the function above is not the size of the samples. It is never used in the function body; the function is simply run once per bootstrap replicate, and the number of replicates is set by boot.n (default 2000).
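
    As for what can be specified: as far as I know, pROC has no argument for a subsampling fraction, but the number of replicates and the stratification can be set. A minimal sketch, assuming a reasonably recent pROC version (for the quiet argument) and using the aSAH example data shipped with the package instead of the question's df:

    ```r
    library(pROC)

    data(aSAH)  # example data set shipped with pROC

    # Build the ROC curve; quiet = TRUE suppresses the level/direction messages
    roc_obj <- roc(aSAH$outcome, aSAH$s100b, quiet = TRUE)

    set.seed(42)  # bootstrap results vary from run to run without a seed

    # boot.n sets the number of replicates (500 here instead of the default 2000);
    # boot.stratified = TRUE is the default and is spelled out only for clarity
    ci_strat <- ci.auc(roc_obj, method = "bootstrap", boot.n = 500,
                       boot.stratified = TRUE, progress = "none")
    ci_strat
    ```

    Setting boot.stratified = FALSE resamples cases and controls together, so the number of cases and controls can differ from one replicate to the next, although the total size of each replicate still equals the size of the original data.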