r, prediction, bayesian, brms, leave-one-out

How to calculate the mean absolute error for a Bayesian model based on leave-one-out cross-validation with the brms package in R


I am fitting a Bayesian model to predict a test score using the brms package. I would like to know how to calculate the mean absolute error (MAE) based on leave-one-out cross-validation (LOO), for example with the loo package, but I could not find any information on how to do this myself.

I would really appreciate it if somebody could demonstrate how to calculate the MAE based on LOO.

Here is a reproducible example:

library(brms)

set.seed(123) # for reproducibility
n <- 100      # number of observations
predictor_1 <- rnorm(n)
predictor_2 <- rnorm(n)
test_score <- 5 + 2*predictor_1 + 3*predictor_2 + rnorm(n)
data <- data.frame(test_score, predictor_1, predictor_2)
head(data)

fit <- brm(test_score ~ predictor_1 + predictor_2, data = data)

# predict() returns a matrix with columns Estimate, Est.Error, Q2.5, Q97.5
predicted_test_score <- predict(fit)

# in-sample mean absolute error based on the posterior mean predictions
mae <- mean(abs(predicted_test_score[, "Estimate"] - data$test_score))
mae

Solution

  • The function brms::kfold_predict() can be used for this, and there is an easily adaptable example in its help file; see ?brms::kfold_predict().

    Perform exact LOO-CV by refitting the model once per observation and save the fits (the object size grows quickly with the number of observations!):

    fit_LOO <- kfold(fit, save_fits = TRUE, chains = 1, folds = "loo")
    

    Obtain draws from the posterior predictive distribution of each of the 100 held-out observations. Because each fold was refit with chains = 1 and the default 2000 iterations (half of which are warmup), each observation gets 1000 draws, yielding a 1000 x 100 matrix:

    predict_LOO <- kfold_predict(fit_LOO, method = "predict")
    
    > dim(predict_LOO$yrep)
    [1] 1000  100
    

    Define and compute your desired loss (based on posterior means):

    MAE <- function(y, yrep) {
      yrep_mean <- colMeans(yrep)
      mean(abs(yrep_mean - y))
    }
    
    > MAE(y = predict_LOO$y, yrep = predict_LOO$yrep)
    [1] 0.7846185
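
    A possible shortcut (a sketch, not part of the answer above): since the question mentions the loo package, the MAE can also be approximated without refitting the model n times by reweighting the existing posterior predictive draws with PSIS-LOO via loo::E_loo(). The snippet below assumes the fit and data objects from the question; it is an approximation, and observations with large Pareto k diagnostics may still require exact refits.

    library(loo)

    # PSIS-LOO on the original fit, keeping the smoothed importance weights
    loo_fit <- loo(fit, save_psis = TRUE)

    yrep <- posterior_predict(fit)   # draws x observations matrix
    ll   <- log_lik(fit)             # pointwise log-likelihood matrix

    # LOO-adjusted posterior predictive mean of each observation
    loo_mean <- E_loo(yrep, loo_fit$psis_object,
                      type = "mean", log_ratios = -ll)$value

    # approximate LOO-based mean absolute error
    mean(abs(loo_mean - data$test_score))

    Compared to kfold(fit, folds = "loo"), this reuses the original posterior draws and is therefore much cheaper, but the result is only as trustworthy as the Pareto k diagnostics reported by loo().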