I am working with vegetation cover data (proportions) for different height strata (0-5, 5-15, 15-30, >30 cm, and also bare ground) across four different sites (sitio) and two different time periods (epoca: breeding and non-breeding season). I went with a GLM using the beta distribution (glmmTMB) and then used emmeans. In a previous question I showed the model I am using and had my interpretation problems solved.
Now I want to know how I can compress or normalize my data columns to exclude 0 and 1 values, since I can't run the model for variables that include 0 values (e.g. 0-5 cm vegetation cover):
beta_sd <- glmmTMB(X0.5 ~ sitio * epoca,
                   data = vege2,
                   family = beta_family)
Error in eval(family$initialize) : y values must be 0 < y < 1
You could replace the exact 0 and 1 values with something very close to 0 and 1, respectively, so that every observation lies strictly inside the open interval (0, 1).
vege2$X0.5 <- with(vege2, replace(X0.5, X0.5 == 0, .0001))
vege2$X0.5 <- with(vege2, replace(X0.5, X0.5 == 1, .9999))
glmmTMB::glmmTMB(X0.5 ~ sitio * epoca, data=vege2, family=glmmTMB::beta_family) |>
summary() |> coef() |> base::`[[`('cond')
#              Estimate Std. Error    z value  Pr(>|z|)
# (Intercept) -1.188736  2.1352218 -0.5567274 0.5777137
# sitio        0.486339  0.6298129  0.7721961 0.4399983
# epoca        0.862026  1.0322244  0.8351149 0.4036530
# sitio:epoca -0.276354  0.3023759 -0.9139419 0.3607474
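An alternative to hard-coding .0001 and .9999 is to compress the whole column slightly towards 0.5, which is what the question literally asks for. A commonly cited choice is the transformation from Smithson and Verkuilen (2006), y' = (y (n - 1) + 0.5) / n, where n is the number of observations. The sketch below is only an illustration: squeeze01 is a hypothetical helper name, not a function from glmmTMB, and it assumes the input is already a proportion in [0, 1].

```r
# Hypothetical helper (not part of glmmTMB): compress proportions from the
# closed interval [0, 1] into the open interval (0, 1) using
# y' = (y * (n - 1) + 0.5) / n, where n is the number of non-missing values.
squeeze01 <- function(y) {
  stopifnot(all(y >= 0 & y <= 1, na.rm = TRUE))
  n <- sum(!is.na(y))
  (y * (n - 1) + 0.5) / n
}

# Small made-up vector containing an exact 0 and an exact 1
y <- c(0, 0.25, 0.5, 1)
y_squeezed <- squeeze01(y)
print(y_squeezed)
```

Applied to the data here, this would be vege2$X0.5 <- squeeze01(vege2$X0.5) before fitting. Unlike the replace() approach, it shifts every value a little, not only the boundary ones, and the size of the shift depends on the sample size, so it is worth checking how sensitive the estimates are to the choice.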
Data:
vege2 <- expand.grid(sitio=1:5, epoca=1:3)
set.seed(42)
vege2$X0.5 <- runif(nrow(vege2))
vege2$X0.5[c(2, 4, 6)] <- c(0, 1, 1)