rr-markdownconfusion-matrixxtable

Merge several confusion matrix with xtable


I have trained several models and want to summarise their performance with three confusion matrix. What I want to do is to combine three different confusion matrix into one table using xtable. I want to combine table 1, 2 and 3. See an example below using XGBoost.

require(xgboost)
require(xtable)
require(caTools)
require(tidyverse)

set.seed(1234)

# Loading data
x1 = c(rnorm(10000, 0,1), rnorm(10000,3,1))
x2 = rnorm(1000)
x3 = rnorm(1000)
class= factor(rep(0:1, each=10000))

df <- as.data.frame(cbind(x1, x2, x3, class))

# Preparing target variable
df$class <- as.numeric(df$class)
df$class <- df$class -1

# Creating a hold-out data
train <- sample.split(df$class, SplitRatio = 0.70)
train.df <- subset(df, train == TRUE)
test.df <- subset(df, train == FALSE)

#Labels. 
labels.train <- train.df[c('class')]
labels.test <- test.df[c('class')]

# Dropping target variable.
train.df <- train.df %>%
  dplyr::select(-class)

test.df <- test.df %>%
  dplyr::select(-class)

# Converting to appropiate format. 
train <- xgb.DMatrix(as.matrix(train.df), label = as.matrix(labels.train))
test <- xgb.DMatrix(as.matrix(test.df), label = as.matrix(labels.test))

watchlist <- list(eval = test, train = train)

# Running the model
model <- xgb.train(data=train,
                   watchlist = watchlist,
                   nround = 1000, 
                   early_stopping_rounds = 25,                 
                   objective = "binary:logistic")

# Predictions
pred <- predict(model, test)

# Evaluating the p-distribution. 
hist(pred)

# Confusion matrix
table1 <- table(pred > 0.5, labels.test$class)
table2 <- table(pred > 0.25, labels.test$class)
table3 <- table(pred > 0.75, labels.test$class)

print(xtable(table1, caption = 'Threshhold = 50%'))
print(xtable(table2, caption = 'Threshhold = 25%'))
print(xtable(table3, caption = 'Threshhold = 75%'))

The outcome now looks like this

enter image description here

but I want it to look something like this

enter image description here


Solution

  • A possible solution using kable() from knitr, add_header_above() and kable_styling() from kableExtra is next. Add this code after creating the confusion matrices:

    #Format table
    t1 <- as.data.frame.matrix(table1)
    t2 <- as.data.frame.matrix(table2)
    t3 <- as.data.frame.matrix(table3)
    #Bind
    tm <- cbind(t1,t2,t3)
    

    Then next code produces the output you want:

    kable(tm,"latex",longtable =T,booktabs =T,caption ="Longtable")%>%
      add_header_above(c(" ","p=50%"=2,"p=25%"=2,"p=75%"=2))%>%
      kable_styling(latex_options =c("repeat_header"))
    

    I have run the previous code in a rmarkdown document and the result is next:

    enter image description here

    You must also add libraries knitr and kableExtra to your code.