
pROC - How to Get Confidence Intervals or Generate a Confusion Matrix


I used the pROC package to do an ROC analysis. It gave me the sensitivities, specificities, etc.

The journal is requesting 95% confidence intervals for every statistic provided. I see I can do that in the epiR package, but I have to give it a confusion matrix.

How do I use the threshold provided by pROC to get a confusion matrix?

Sample data and code:

library(pROC)
library(tibble)

data <- tribble(
  ~death, ~score,
  0, 0.132,
  1, 0.19, 
  0, 0.03,
  1, 0.131,
  0, 0.02
)

roc <- roc(data$death, data$score, smoothed = TRUE,
           ci = TRUE, ci.alpha = 0.95, stratified = FALSE,
           plot = TRUE, auc.polygon = TRUE, max.auc.polygon = TRUE, grid = TRUE,
           print.auc = TRUE, show.thres = TRUE)

coords(roc, x = "best", ret = c("threshold", "specificity", "sensitivity", "accuracy",
                                "precision", "recall", "tpr", "ppv", "fpr"))

Solution

  • To generate a confusion matrix, you first need to assign predicted outcomes (predicted death, predicted survival) according to a threshold. The AUC is calculated over every possible threshold in your data; in this example I have arbitrarily selected the second-lowest threshold for illustration.

    # first assign a threshold
    thres <- roc$thresholds[2]
    
    # assign labels to your data according to the threshold
    data$predicted_death <- data$score > thres
    
    # convert to character vectors to facilitate interpretation
    # (data$predicted_death is logical, so it can be used directly as the test)
    data$predicted_death <- ifelse(data$predicted_death, "predicted_dead", "predicted_alive")
    data$death <- ifelse(data$death == 1, "dead", "alive")
    
    # count the true positives, false positives, false negatives and true
    # negatives in a confusion matrix using the R function table()
    cm <- table(data$death, data$predicted_death)
    

    I would advise choosing a threshold that optimises both sensitivity and specificity, such as the Youden index (this is what coords(roc, x = "best") uses by default).
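    Once you have the confusion matrix, you can address the journal's request for 95% confidence intervals with epiR::epi.tests(), as mentioned in the question. One caveat: epi.tests() expects the table with rows as the test result and columns as the true outcome, with the positive level first in each dimension, whereas table() orders levels alphabetically. A minimal sketch (the labelled vectors below recreate the result of the steps above so the snippet runs on its own; the ordering code is the important part):

    ```r
    library(epiR)

    # Labelled outcomes as produced by the answer's code at the example threshold
    data <- data.frame(
      death = c("alive", "dead", "alive", "dead", "alive"),
      predicted_death = c("predicted_dead", "predicted_dead", "predicted_dead",
                          "predicted_dead", "predicted_alive")
    )

    # Put the positive level first in both dimensions so the cell order is
    # TP, FP / FN, TN, as epi.tests() expects
    predicted <- factor(data$predicted_death,
                        levels = c("predicted_dead", "predicted_alive"))
    actual <- factor(data$death, levels = c("dead", "alive"))
    cm_ordered <- table(predicted, actual)

    # point estimates and 95% CIs for sensitivity, specificity, PPV, NPV, etc.
    epi.tests(cm_ordered, conf.level = 0.95)
    ```

    With only five observations the intervals will be extremely wide; run this on your full dataset.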