rmixed-modelslasso-regressionrandom-effectstweedie

Can glmmLasso be used with the Tweedie distribution?


I have a linear mixed effects model and I am trying to do variable selection. The model is testing the level of forest degradation in 1000 sampled points. Most points have no degradation, and so the dependent variable is highly skewed with many zeros. Therefore, I am using the Tweedie distribution to fit the model. My main question is: can the Tweedie distribution actually be used in the glmmLasso function? My second question is: do I even need to use this distribution in glmmLasso()? Any help is much appreciated!

When I run the function with family = tweedie(var.power=1.2,link.power=0) I get the following error:

Error in logLik.glmmLasso(y = y, yhelp = yhelp, mu = mu, family = family,  : 
  object 'loglik' not found

If I change the link.power from 0 to 1 (which I think is not correct for my model, but just for the sake of figuring out the problem), I get a different error:

Error in grad.lasso[b.is.0] <- score.beta[b.is.0] - lambda.b * sign(score.beta[b.is.0]) : 
  NAs are not allowed in subscripted assignments

Here tweedie comes from the statmod package. A simple example:

library(tweedie)
library(tidyverse)
library(glmmLasso)
library(statmod)

power <- 2
mu <- 1
phi <- seq(2, 8, by=0.1)
set.seed(10000)
y <- rtweedie( 100, mu=mu, power=power, phi=3)

x <- rnorm(100)
z <- c(rep(1, 50), rep(2,50))
df = as.data.frame(cbind(y,x,z))
df$z = as.factor(df$z)
f = y ~ x

varSelect = glmmLasso(fix = f, rnd = list(z=~1), data = df, 
                      lambda = 5, family = tweedie(var.power=1.2,link.power=0))


Solution

  • I created a hacked version of glmmLasso that incorporates the Tweedie distribution as an option and put it on Github. I had to change two aspects of the code:

    remotes::install_github("bbolker/glmmLasso-bmb")
    packageVersion("glmmLasso")
    ## [1] ‘1.6.2.9000’
    

    Your example runs for me now, but I haven't checked at all to see if the results are sensible.