I am attempting to a run a zero-inflated negative binomial mixed model. I ran an experiment where I gave adult snails a phosphorus-enriched diet and a phosphorous-depleted diets (P
). These snails all come from different families; there are 10 families (Lineages
). We also have 3 ploidy levels (diploid, triploid, tetraploid). Lineage is a random factor and will always be nested underneath a ploidy level. For example, Lineage A can only be diploid. It can never be triploid nor tetraploid. My dependent variable is embryo count, which is zero-inflated and the highest embryo number is 49. We measured adult length (Length
) since it has a significant linear relationship with embryo number.
This is my code:
model5 <- zeroinfl(Embryonumber ~ (Ploidy/1|Lineage) + P:Ploidy + Length,
data = x, dist = "negbin")
summary(model5)
What I think I have asked R to run is a zero-inflated negative binomial mixed model. Dependent is embryo number. Ploidy and P are fixed factors and I have included them as stand-alone factors and their interaction. Length is a covariate. Lineage is random and nested in Ploidy - Lineage cannot and should not be stand alone.
But I receive this error:
Error in Ploidy/1 : non-numeric argument to binary operator
I have 2 ideas here: 1) the format of my data is wrong or 2) coded a random factor nested in a fixed factor wrong.
When I run summary of my data (summary (data)
) I receive:
Lineage Ploidy P SnailLetterSymbol Length Embryonumber
Length:99 Length:99 Length:99 Length:99 Min. :3.500 Min. : 0.000
Class :character Class :character Class :character Class :character 1st Qu.:4.000 1st Qu.: 2.000
Mode :character Mode :character Mode :character Mode :character Median :4.100 Median : 6.000
Mean :4.219 Mean : 9.364
3rd Qu.:4.500 3rd Qu.:14.000
Max. :5.000 Max. :49.000
Presence Standardized residuals
Min. :0.0000 Min. :-1.405440
1st Qu.:1.0000 1st Qu.:-0.706935
Median :1.0000 Median :-0.299980
Mean :0.7879 Mean :-0.000001
3rd Qu.:1.0000 3rd Qu.: 0.593960
Max. :1.0000 Max. : 3.461570
Do I need to recode my Ploidy or Lineage as a vector?
Here are the two columns in question, but let me know if you need more:
I am unsure if I used the correct syntax for nesting and random factors. I see most vignettes that random and nested factors always have the (1 | in front of both factors. I have not done this, because Ploidy is not random only Lineage is. I have taken advice on how to nest a random factor from: 1) https://stats.stackexchange.com/questions/228800/crossed-vs-nested-random-effects-how-do-they-differ-and-how-are-they-specified?noredirect=1&lq=1 2) https://www.muscardinus.be/statistics/nested.html 3) https://rdoodles.rbind.io/2019/11/nested-random-factors-in-mixed-multilevel-or-hierarchical-models/ 4) https://ourcodingclub.github.io/tutorials/mixed-models/#nested
I then attempted to use long-hand syntax (as described in link 2) but I receive an error. Code:
model5 <- zeroinfl(Embryonumber ~ (Ploidy:1|Lineage) + P:Ploidy + Length,
data = x, dist = "negbin")
summary(model5)
Error: Error in Ploidy:1 : NA/NaN argument
Do you all have any advice? Thank you!
An update: I have attempted the below code and I still receive the Error in Ploidy/Lineage : non-numeric argument to binary operator.
model4 <- zeroinfl(Embryonumber ~ (1 | Ploidy / Lineage) + P:Ploidy + Length| Presence,
data = x, dist = "negbin")
summary(model4)
As s commenter suggested I removed the nested factor, which is not an appropriate model for my data, but I wanted to at least test that the function works. Code:
model4 <- zeroinfl(Embryonumber ~ (1 | Lineage) + (1 | Ploidy) + P:Ploidy + Length| Presence,
data = x, dist = "negbin")
summary(model4)
Error:
Error in 1 | Lineage :
operations are possible only for numeric, logical or complex types
zeroinfl
doesn't handle random effects, so the only bar (|
) in your formula should be the one splitting the conditional from the zero-inflation model, e.g.Embryonumber ~ Lineage + P*Ploidy + Length| Presence
glmmTMB
, e.g.library(glmmTMB)
glmmTMB(Embryonumber ~ P*Ploidy + Length + (1|Lineage),
ziformula = ~ Presence,
data = x, family = "nbinom2")
Ploidy
as a random effect grouping variable; for a random effect grouping variable, you need (1) enough levels to estimate an among-group variance (≥ 5 is a typical rule of thumb) and (2) you need the levels to be exchangeable, i.e. the labels of the levels could be switched without changing the meaning of the model, which I suspect is not true for ploidy.