So We are triying to fit a glmm to my_data. my_data
Figure Figure resume the descriptive statistic of my_data. D0.Richness is unbalanced (n=sample size for each Group1 level) as we can see in figure. Then we fit data in nlme:
m.1.a = lme(D0_.Riqueza.~Tratamento+SistemaTempoTemperatura, random = ~1|Tank/Sample, data = my_data, control =list(msMaxIter = 1000, msMaxEval = 1000))
Linear mixed-effects model fit by REML Data: my_data AIC BIC logLik 728.1183 775.5072 -344.0591 Random effects: Formula: ~1 | Tank (Intercept) StdDev: 9.578657 Formula: ~1 | Sample %in% Tank (Intercept) Residual StdDev: 14.19689 5.685644 Fixed effects: D0_.Riqueza. ~ Tratamento + Sistema * Tempo * Temperatura Value Std.Error DF t-value p-value (Intercept) -883.5998 1498.9767 72 -0.5894687 0.5574 TratamentoControl -0.3201 7.8723 7 -0.0406651 0.9687 SistemaBiofloco 467.9542 1508.0596 7 0.3103022 0.7654 TempoDay06 1729.2678 1045.7843 72 1.6535607 0.1026 TempoDay11 890.5315 1459.7221 72 0.6100692 0.5437 TempoDay30 575.8523 1283.1337 72 0.4487859 0.6549 Temperatura 28.1173 46.0317 72 0.6108247 0.5432 SistemaBiofloco:TempoDay06 -1090.8551 1069.6759 72 -1.0197996 0.3112 SistemaBiofloco:TempoDay11 -10.2688 1477.1431 72 -0.0069518 0.9945 SistemaBiofloco:TempoDay30 -210.7544 1297.7599 72 -0.1623986 0.8714 SistemaBiofloco:Temperatura -14.3765 46.2605 72 -0.3107720 0.7569 TempoDay06:Temperatura -59.7401 31.9417 72 -1.8702879 0.0655 TempoDay11:Temperatura -26.9311 50.7368 72 -0.5308009 0.5972 TempoDay30:Temperatura -16.4391 43.8698 72 -0.3747247 0.7090 SistemaBiofloco:TempoDay06:Temperatura 39.3035 32.9882 72 1.1914422 0.2374 SistemaBiofloco:TempoDay11:Temperatura -0.7773 51.3028 72 -0.0151506 0.9880 SistemaBiofloco:TempoDay30:Temperatura 6.3729 44.3377 72 0.1437347 0.8861 Correlation: (Intr) TrtmnC SstmBf TmpD06 TmpD11 TmpD30 Tmprtr SsB:TD06 SsB:TD11 SsB:TD30 SstB:T TD06:T TratamentoControl 0.425 SistemaBiofloco -0.992 -0.419 TempoDay06 -0.786 -0.230 0.781 TempoDay11 -0.162 0.072 0.162 0.341 TempoDay30 -0.171 0.090 0.170 0.381 0.357 Temperatura -1.000 -0.428 0.992 0.786 0.162 0.170 SistemaBiofloco:TempoDay06 0.768 0.223 -0.774 -0.977 -0.334 -0.373 -0.768 SistemaBiofloco:TempoDay11 0.158 -0.076 -0.167 -0.336 -0.989 -0.354 -0.158 0.343 SistemaBiofloco:TempoDay30 0.165 -0.097 -0.178 -0.375 -0.354 -0.989 -0.165 0.382 0.362 SistemaBiofloco:Temperatura 0.992 0.419 -1.000 -0.781 -0.162 -0.170 -0.992 0.774 0.167 0.178 TempoDay06:Temperatura 0.607 0.123 -0.603 -0.968 -0.374 -0.421 -0.606 0.946 0.369 0.415 0.603 TempoDay11:Temperatura 0.032 -0.129 -0.032 -0.241 -0.991 -0.339 -0.031 0.236 0.980 0.336 0.032 0.298 TempoDay30:Temperatura 0.037 -0.150 -0.037 -0.279 -0.340 -0.991 -0.036 0.273 0.337 0.981 0.037 0.344 SistemaBiofloco:TempoDay06:Temperatura -0.587 -0.119 0.593 0.937 0.362 0.408 0.587 -0.969 -0.372 -0.417 -0.593 -0.968 SistemaBiofloco:TempoDay11:Temperatura -0.030 0.132 0.037 0.238 0.981 0.336 0.029 -0.245 -0.992 -0.343 -0.037 -0.294 SistemaBiofloco:TempoDay30:Temperatura -0.033 0.156 0.045 0.274 0.337 0.981 0.033 -0.283 -0.344 -0.991 -0.045 -0.340 TD11:T TD30:T SB:TD06: SB:TD11: TratamentoControl SistemaBiofloco TempoDay06 TempoDay11 TempoDay30 Temperatura SistemaBiofloco:TempoDay06 SistemaBiofloco:TempoDay11 SistemaBiofloco:TempoDay30 SistemaBiofloco:Temperatura TempoDay06:Temperatura TempoDay11:Temperatura TempoDay30:Temperatura 0.339 SistemaBiofloco:TempoDay06:Temperatura -0.288 -0.333 SistemaBiofloco:TempoDay11:Temperatura -0.990 -0.336 0.298 SistemaBiofloco:TempoDay30:Temperatura -0.337 -0.991 0.343 0.343 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -0.964339023 -0.204337214 0.001215562 0.178330841 1.035617143 Number of Observations: 96 Number of Groups: Tank Sample %in% Tank 10 96 $ r.squaredGLMM(m.1.a)*100 R2m R2c [1,] 37.35081 93.78055
#"Sample" is a unique individual. Individuals were measured just once. It was considered random intercept of each sample nested in Tank. Eventually it was considered a random slope by Tempo (~Tempo|Tank/Sample), but nlme give a error message "Error in lme.formula(D0_.Riqueza. ~ Tratamento + Sistema * Tempo + Tempo * : fewer observations than random effects in all level 2 groups".
Question 1: Is valid the structure of our random effects (nesting each "Sample" within its respective "Tank")?
Question 2: Why the residual plot of "m.1.a" residual.m.1.a does display a linear beahvior? Is it because the random effect structure apllied?
So We tried with other random efects structure, avoid nesting:
m.1.b = lme(D0_.Riqueza.~Tratamento+SistemaTempoTemperatura, random = ~ Tempo|Tank, data = my_data, control =list(msMaxIter = 1000, msMaxEval = 1000))
Linear mixed-effects model fit by REML Data: my_data AIC BIC logLik 742.6198 808.9644 -343.3099 Random effects: Formula: ~Tempo | Tank Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr (Intercept) 8.240640 (Intr) TmpD06 TmpD11 TempoDay06 5.070250 0.510 TempoDay11 6.103859 0.875 0.863 TempoDay30 6.964783 -0.416 -0.994 -0.804 Residual 14.848455 Fixed effects: D0_.Riqueza. ~ Tratamento + Sistema * Tempo * Temperatura Value Std.Error DF t-value p-value (Intercept) -699.2080 1353.0931 72 -0.5167479 0.6069 TratamentoControl 1.9563 7.2293 7 0.2706094 0.7945 SistemaBiofloco 300.5875 1360.6927 7 0.2209077 0.8315 TempoDay06 1659.6087 995.3994 72 1.6672791 0.0998 TempoDay11 920.8840 1701.0956 72 0.5413476 0.5899 TempoDay30 609.2401 1518.2923 72 0.4012667 0.6894 Temperatura 22.4262 41.5534 72 0.5396953 0.5911 SistemaBiofloco:TempoDay06 -1025.2037 1022.8656 72 -1.0022858 0.3196 SistemaBiofloco:TempoDay11 -31.1847 1717.5524 72 -0.0181565 0.9856 SistemaBiofloco:TempoDay30 -263.6907 1532.1944 72 -0.1721000 0.8638 SistemaBiofloco:Temperatura -9.2445 41.7400 72 -0.2214774 0.8253 TempoDay06:Temperatura -58.6019 32.7316 72 -1.7903776 0.0776 TempoDay11:Temperatura -28.8282 60.9389 72 -0.4730668 0.6376 TempoDay30:Temperatura -18.3361 50.9577 72 -0.3598304 0.7200 SistemaBiofloco:TempoDay06:Temperatura 38.1812 33.8937 72 1.1264979 0.2637 SistemaBiofloco:TempoDay11:Temperatura 0.6981 61.4697 72 0.0113569 0.9910 SistemaBiofloco:TempoDay30:Temperatura 8.8940 51.4277 72 0.1729427 0.8632 Correlation: (Intr) TrtmnC SstmBf TmpD06 TmpD11 TmpD30 Tmprtr SsB:TD06 SsB:TD11 SsB:TD30 SstB:T TD06:T TratamentoControl 0.433 SistemaBiofloco -0.992 -0.424 TempoDay06 -0.651 -0.222 0.646 TempoDay11 0.130 0.057 -0.129 0.319 TempoDay30 -0.358 0.070 0.357 0.218 0.023 Temperatura -1.000 -0.435 0.992 0.651 -0.130 0.358 SistemaBiofloco:TempoDay06 0.633 0.215 -0.639 -0.973 -0.311 -0.213 -0.633 SistemaBiofloco:TempoDay11 -0.129 -0.057 0.123 -0.316 -0.990 -0.023 0.129 0.324 SistemaBiofloco:TempoDay30 0.350 -0.081 -0.361 -0.214 -0.024 -0.992 -0.349 0.217 0.029 SistemaBiofloco:Temperatura 0.992 0.424 -1.000 -0.646 0.129 -0.357 -0.992 0.639 -0.123 0.361 TempoDay06:Temperatura 0.422 0.110 -0.419 -0.963 -0.428 -0.133 -0.421 0.937 0.423 0.131 0.419 TempoDay11:Temperatura -0.227 -0.099 0.225 -0.248 -0.995 0.013 0.227 0.242 0.986 -0.011 -0.225 0.378 TempoDay30:Temperatura 0.264 -0.118 -0.263 -0.157 -0.038 -0.995 -0.263 0.153 0.038 0.987 0.263 0.093 SistemaBiofloco:TempoDay06:Temperatura -0.408 -0.108 0.412 0.930 0.413 0.129 0.407 -0.964 -0.426 -0.133 -0.412 -0.966 SistemaBiofloco:TempoDay11:Temperatura 0.225 0.098 -0.219 0.246 0.987 -0.013 -0.225 -0.255 -0.995 0.007 0.219 -0.374 SistemaBiofloco:TempoDay30:Temperatura -0.256 0.129 0.267 0.152 0.038 0.987 0.256 -0.156 -0.043 -0.995 -0.267 -0.091 TD11:T TD30:T SB:TD06: SB:TD11: TratamentoControl SistemaBiofloco TempoDay06 TempoDay11 TempoDay30 Temperatura SistemaBiofloco:TempoDay06 SistemaBiofloco:TempoDay11 SistemaBiofloco:TempoDay30 SistemaBiofloco:Temperatura TempoDay06:Temperatura TempoDay11:Temperatura TempoDay30:Temperatura 0.011 SistemaBiofloco:TempoDay06:Temperatura -0.365 -0.090 SistemaBiofloco:TempoDay11:Temperatura -0.991 -0.011 0.378 SistemaBiofloco:TempoDay30:Temperatura -0.012 -0.992 0.093 0.016 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.42113917 -0.54063148 -0.02540128 0.43542432 2.86898537 Number of Observations: 96 Number of Groups: 10 > r.squaredGLMM(m.1.b)*100 R2m R2c [1,] 36.54991 58.24922
Actually, we compare both models:
anova(m.1.a, m.1.b) Model df AIC BIC logLik Test L.Ratio p-value m.1.a 1 20 728.1183 775.5072 -344.0591 m.1.b 2 28 742.6198 808.9644 -343.3099 1 vs 2 1.498451 0.9927
Despite there were no differences between models (although AIC and BIC was lower for m.1.a), residual plot (m.1.b) was different (and "better") residual.m.1.b.
Should we keep model m.1.a? If so, is there a way to understand and improve the residuals behavior of m.1.a? Any help will be appreciated.
Fitting a random effect where there's only one observation per block severely reduces the identifiability of your model: the random effect is directly competing with your residual error and nothing else. This is also the reason why you can't add a second random effect parameter, there's just no additional degrees of freedom to fit it - not that it would make any sense.
While you haven't explained your experimental setup or hypothesis in much detail the second model seems to make a lot more sense, where the random effect will handle the between-tank variability and the residual is subject within tank. In the second model you should also be able to add a random slope without issues (both for producing the fit and its interpretation).