I am trying to fit a GLMM in R. I want to find out how the emergence time of bats depends on different factors. As the dependent variable (metric) I use the time difference between the departure of the respective bat and sunset on that day. As fixed effects I would like to include different weather variables (metric) as well as the reproductive state (categorical) of the bats. Additionally, the transponder number (individual identification code) enters as a random effect to account for inter-individual differences between the bats.
I first worked in R with a linear mixed model (package lme4), but the QQ plot of the residuals deviates very strongly from the normal distribution. Also a histogram of the data rather indicates a gamma distribution. As a result, I implemented a GLMM with a gamma distribution. Here is an example with one weather parameter:
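# The family object is Gamma (capital G); a column name containing a space needs backticks
model <- glmer(formula = difference_in_min ~ repro + precipitation +
(1 + repro | `transponder number`),
data = trip, control = ctrl, family = Gamma(link = "log"))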
However, since this did not change the QQ plot, I looked at the residual diagnostics from the DHARMa package. The distributional assumption still does not seem to hold, because the points in that QQ plot also deviate strongly from the expected line.
But if the data also do not correspond to a gamma distribution, what alternative is there? Or maybe the problem lies somewhere else entirely.
Does anyone have an idea where the error might lie?
But if the data also do not correspond to a gamma distribution, what alternative is there?
A Gaussian (normal) model assumes that the residuals are roughly symmetric and normally distributed around zero, which does not sound like what you have. A lognormal model only makes that assumption on the log scale, so it can accommodate right-skewed responses, provided all values are strictly positive. Following your previous code, you would fit it like this:
model <- glmer(formula = log(difference_in_min) ~ repro + precipitation + (1 + repro | `transponder number`), data = trip, control = ctrl, family = gaussian(link = "identity"))
Or, instead of glmer, you can call lmer directly, which does not need a family argument (glmer may tell you to do this in a warning message anyway):
# For lmer, ctrl has to be created with lmerControl() rather than glmerControl()
model <- lmer(formula = log(difference_in_min) ~ repro + precipitation + (1 + repro | `transponder number`), data = trip, control = ctrl)
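To see whether the log transformation actually helps, you can check the residuals on the log scale. Below is a minimal sketch on simulated data, not on your trip data: the data frame sim, the column transponder, the effect sizes and the sample sizes are all made up for illustration, and the model uses a random intercept only to keep the simulation small.

```r
library(lme4)

set.seed(1)

# Hypothetical stand-in for `trip`: 30 bats with 20 emergences each
n_bats <- 30
n_obs  <- 20
sim <- data.frame(
  transponder   = factor(rep(seq_len(n_bats), each = n_obs)),
  repro         = factor(rep(rep(c("lactating", "non_repro"),
                                 length.out = n_bats), each = n_obs)),
  precipitation = runif(n_bats * n_obs, min = 0, max = 5)
)

# Lognormal response: linear predictor plus a per-bat intercept on the log scale
bat_effect <- rnorm(n_bats, sd = 0.3)
log_mean <- 3 + 0.2 * (sim$repro == "non_repro") + 0.1 * sim$precipitation +
  bat_effect[as.integer(sim$transponder)]
sim$difference_in_min <- exp(log_mean + rnorm(nrow(sim), sd = 0.4))

# log() is undefined for zero or negative values, so check before fitting
stopifnot(all(sim$difference_in_min > 0))

# Lognormal mixed model: an ordinary LMM on the log-transformed response
fit_log <- lmer(log(difference_in_min) ~ repro + precipitation +
                  (1 | transponder), data = sim)

# Fixed effects are on the log scale
print(fixef(fit_log))

# Residual QQ plot on the log scale
qqnorm(resid(fit_log))
qqline(resid(fit_log))

# Optional: simulation-based residual diagnostics, if DHARMa is installed
if (requireNamespace("DHARMa", quietly = TRUE)) {
  sim_res <- DHARMa::simulateResiduals(fit_log)
  plot(sim_res)
}
```

Because the coefficients are on the log scale, exp() of a coefficient can be read as a multiplicative effect on the median of difference_in_min. If your real data contain zero or negative differences (bats leaving at or before sunset), the log transformation cannot be applied as is.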