I am fitting a Bayesian model to predict a test score using the Brms
package.
I would like to know how to calculate the 'Mean Absolute Error' based on 'Leave-One-Out Cross-Validation' (LOO) using the LOO
package, but I could not find any information related to how to actualize it by myself.
I would really appreciate if somebody demonstrate me how to calculate MAE based on LOO.
Here is a replicable sample code:
set.seed(123) # for reproducibility
n <- 100 # number of observations
predictor_1 <- rnorm(n)
predictor_2 <- rnorm(n)
test_score <- 5 + 2*predictor_1 + 3*predictor_2 + rnorm(n)
data <- data.frame(test_score, predictor_1, predictor_2)
head(data)
fit <- brm(test_score ~ predictor_1 + predictor_2, data = data)
predicted_test_score <- predict(fit)
# calculate mean absolute error
mae <- mean(abs(predicted_test_score - data$test_score))
mae
The function brms::kfold_predict()
can be used for this and there is an easily adaptable example in the help file, see ?brms::kfold_predict()
.
Perform LOOCV and save the fits (object size grows quickly with nobs!)
fit_LOO <- kfold(fit, save_fits = TRUE, chains = 1, folds = "loo")
Obtain samples (1000 is the default) from the posterior predictive distribution of each of the 100 observations. This yields a 1000x100 matrix.
predict_LOO <- kfold_predict(fit_LOO, method = "predict")
> dim(predict_LOO$yrep)
[1] 1000 100
Define and compute your desired loss (based on posterior means):
MAE <- function(y, yrep) {
yrep_mean <- colMeans(yrep)
mean(abs(yrep_mean - y))
}
> MAE(y = predict_LOO$y, yrep = predict_LOO$yrep)
[1] 0.7846185