I am using the glmnet
package in R, and not(!) the caret
package for my binary ElasticNet regression. I have come to the point where I would like to compare models (e.g. lambda set to lambda.1se
or lambda.min
, and models where k-fold
is set to 5 or 10). But, I have not yet achieved to compute the AICc
or BIC
for my models. How do I do that? I tried this and this but it did not work for me, I only get an empty list.
Code:
set.seed(123)
foldid <- sample(rep(seq(10), length.out = nrow(x.train)))
list.of.fits.df <- list()
for (i in 0:10){
fit.name <- paste0("alpha", i/10)
list.of.fits.df[[fit.name]] <- cv.glmnet(x.train, y.train, type.measure = c("auc"), alpha = i/10, family = "binomial", nfolds = 10, foldid = foldid, parallel = TRUE)
}
best.fit <- coef(list.of.fits.df[[fit.name]], s = list.of.fits.df[[fit.name]]$lambda.1se)
best.fit.min <- coef(list.of.fits.df[[fit.name]], s = list.of.fits.df[[fit.name]]$lambda.min)
#AICc & BIC
#???
How can I find the AICc
and BIC
for my best fit model?
You can alter the solution given in this answer slightly to obtain the desired result The reason it doesn't work "out of the box" is that the cv.glmnet
function returns the result of several fits, but the individual results are stored in x$glmnet.fit
, and we can use this to create a simple function for calculating AICc
and BIC
.
glmnet_cv_aicc <- function(fit, lambda = 'lambda.1se'){
whlm <- which(fit$lambda == fit[[lambda]])
with(fit$glmnet.fit,
{
tLL <- nulldev - nulldev * (1 - dev.ratio)[whlm]
k <- df[whlm]
n <- nobs
return(list('AICc' = - tLL + 2 * k + 2 * k * (k + 1) / (n - k - 1),
'BIC' = log(n) * k - tLL))
})
}
All we'll then have to do is provide the model and get our estimated AICc
.
best.aicc <- glmnet_cv_aicc(list.of.fits.df[[fit.name]])
best.aicc.min <- glmnet_cv_aicc(list.of.fits.df[[fit.name]], 'lambda.min')
For a reproducible example, one could use one of the many examples provided in help(glmnet)
n = 500
p = 30
nzc = trunc(p/10)
x = matrix(rnorm(n * p), n, p)
beta3 = matrix(rnorm(30), 10, 3)
beta3 = rbind(beta3, matrix(0, p - 10, 3))
f3 = x %*% beta3
p3 = exp(f3)
p3 = p3/apply(p3, 1, sum)
g3 = glmnet:::rmult(p3)
set.seed(10101)
cvfit = cv.glmnet(x, g3, family = "multinomial")
print(glmnet_cv_aicc(cvfit))
# Output
#$AICc
#[1] -556.2404
#
#$BIC
#[1] -506.3058
print(glmnet_cv_aicc(cvfit, 'lambda.min'))
# Output
#$AICc
#[1] -601.0234
#
#$BIC
#[1] -506.4068