Consider this simple example:
library(tibble)

# data_frame() is deprecated; tibble() is the current constructor
df <- tibble(truth = c(1, 1, 0, 0),
             prediction = c(1, 0, 1, 0),
             n_obs = c(100, 10, 90, 50))
df
# A tibble: 4 x 3
  truth prediction n_obs
  <dbl>      <dbl> <dbl>
1     1          1   100
2     1          0    10
3     0          1    90
4     0          0    50

I would like to pass this tibble to caret::confusionMatrix so that I have all the metrics I need at once (accuracy, recall, etc.).
As you can see, the tibble contains all the information required to compute performance statistics. For instance, you can see that in the test dataset (not available here), there are 100 observations where the predicted label 1 matched the true label 1. However, 90 observations with a predicted value of 1 were actually false positives.
I do not want to compute all the metrics by hand and would like to use caret::confusionMatrix() instead. However, this has proven to be surprisingly difficult. Calling confusionMatrix(.) on the tibble above does not work. Is there any solution here?
Thanks!
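You could use the following, with df being the tibble from the question. xtabs() sums n_obs for each combination of prediction and truth, which gives the 2 x 2 table that confusionMatrix() accepts. You have to set the positive class to "1" explicitly; otherwise the first level, 0, will be taken as the positive class.

library(caret)
confusionMatrix(xtabs(n_obs ~ prediction + truth, data = df), positive = "1")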
Confusion Matrix and Statistics

          truth
prediction   0   1
         0  50  10
         1  90 100

               Accuracy : 0.6
                 95% CI : (0.5364, 0.6612)
    No Information Rate : 0.56
    P-Value [Acc > NIR] : 0.1128

                  Kappa : 0.247

 Mcnemar's Test P-Value : 2.789e-15

            Sensitivity : 0.9091
            Specificity : 0.3571
         Pos Pred Value : 0.5263
         Neg Pred Value : 0.8333
             Prevalence : 0.4400
         Detection Rate : 0.4000
   Detection Prevalence : 0.7600
      Balanced Accuracy : 0.6331

       'Positive' Class : 1
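
If you want to convince yourself that the table is oriented the right way, the main figures can be checked from the four counts in base R. This is only a rough sketch, not part of caret: it rebuilds the question's data as a plain data.frame named df, and the names tab, tp, fp, fn, tn and metrics are made up for this illustration.

```r
# Rebuild the counts from the question (base R only, no caret needed).
df <- data.frame(truth      = c(1, 1, 0, 0),
                 prediction = c(1, 0, 1, 0),
                 n_obs      = c(100, 10, 90, 50))

# Same weighted cross-tabulation that is passed to confusionMatrix():
# rows are predictions, columns are the truth.
tab <- xtabs(n_obs ~ prediction + truth, data = df)

# Pick out the four cells, treating "1" as the positive class.
tp <- tab["1", "1"]  # predicted 1, truly 1
fp <- tab["1", "0"]  # predicted 1, truly 0
fn <- tab["0", "1"]  # predicted 0, truly 1
tn <- tab["0", "0"]  # predicted 0, truly 0

metrics <- c(
  accuracy    = (tp + tn) / sum(tab),
  sensitivity = tp / (tp + fn),  # recall
  specificity = tn / (tn + fp),
  ppv         = tp / (tp + fp),  # precision, "Pos Pred Value"
  npv         = tn / (tn + fn)   # "Neg Pred Value"
)
print(round(metrics, 4))
```

The rounded values should agree with the Accuracy, Sensitivity, Specificity, Pos Pred Value and Neg Pred Value lines above. If sensitivity and specificity come out swapped, the positive class or the row/column order in the xtabs() formula is probably reversed.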