I am currently looking at country-wide population ‘censuses’ that were conducted by feeding vultures at all restaurant sites simultaneously. Simultaneous census counts were conducted twice in June each year, with 10–14 days between each occasion, in 2004, and from 2006 to 2023. During those events, three species of vultures are monitored and the number of individuals feeding is counted.
For each species, we fit one set of generalised linear models. For the census data we modelled the total count at all sites as a function of either a linear or quadratic effect of year, and an additive fixed effect of sampling occasions, i.e. the first or second of the two census occasions per year. Census occasion may be important if birds that were present at the first census occasion may not disperse widely after feeding and are more likely to be present at the second occasion. When looking at the degree of freedom in the results, I feel like the quadratic term is not accurately taken into account into the model.
Here is a subset of the data we are working on:
Date RHV SBV WRV Total Census Month Year Season Occasion
6/10/2004 40 34 88 162 Yes 6 2004 Wet 1
6/20/2004 42 25 90 157 Yes 6 2004 Wet 2
6/10/2005 58 27 149 234 Yes 6 2005 Wet 1
6/20/2005 32 31 83 146 Yes 6 2005 Wet 2
6/10/2007 35 24 160 219 Yes 6 2007 Wet 1
6/20/2007 40 26 150 216 Yes 6 2007 Wet 2
6/10/2008 48 30 113 191 Yes 6 2008 Wet 1
6/20/2008 44 51 191 286 Yes 6 2008 Wet 2
6/10/2009 11 84 50 145 Yes 6 2009 Wet 1
6/20/2009 13 30 209 252 Yes 6 2009 Wet 2
Here are the results the results of summary(data)
, the column Date_Num was not used for this analysis (Date_Num corresponds to the number of days, taking as baseline the 1st of June 2004).
Date RHV SBV WRV Total Month
Min. :2004-06-10 Min. :10.00 Min. :15.00 Min. : 42.00 Min. : 70.0
1st Qu.:2009-09-16 1st Qu.:15.25 1st Qu.:28.50 1st Qu.: 71.25 1st Qu.:122.0
Median :2014-06-15 Median :21.00 Median :35.00 Median : 89.00 Median :155.0
Mean :2014-05-07 Mean :25.89 Mean :37.29 Mean :104.24 Mean :167.4
3rd Qu.:2019-03-13 3rd Qu.:38.00 3rd Qu.:44.50 3rd Qu.:136.00 3rd Qu.:214.8
Max. :2023-06-16 Max. :58.00 Max. :84.00 Max. :209.00 Max. :289.0
Year Occasion Date_Num
Min. :2004 Min. :1.0 Min. : 9
1st Qu.:2009 1st Qu.:1.0 1st Qu.:1934
Median :2014 Median :1.5 Median :3666
Mean :2014 Mean :1.5 Mean :3627
3rd Qu.:2019 3rd Qu.:2.0 3rd Qu.:5398
Max. :2023 Max. :2.0 Max. :6954
Here is the line for the GLM we used
modelRHVDate <- glm(Species ~ Year + I(Year^2) + Occasion + Year:Occasion + I(Year^2):Occasion ,
family = quasipoisson(link = "log"),
data = data)
summary(modelRHVDate)
Here are the results:
Call:
glm(formula = Species ~ Year + I(Year^2) + Occasion + Year:Occasion +
I(Year^2):Occasion, family = quasipoisson(link = "log"),
data = data)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.560e+04 2.341e+04 1.094 0.282
Year -2.535e+01 2.327e+01 -1.090 0.284
I(Year^2) 6.274e-03 5.780e-03 1.086 0.286
Occasion -1.531e+04 1.475e+04 -1.037 0.307
Year:Occasion 1.519e+01 1.466e+01 1.036 0.308
I(Year^2):Occasion -3.769e-03 3.641e-03 -1.035 0.308
(Dispersion parameter for quasipoisson family taken to be 2.874796)
Null deviance: 228.254 on 37 degrees of freedom
Residual deviance: 99.214 on 32 degrees of freedom
AIC: NA
Number of Fisher Scoring iterations: 4
Models were fitted with quasi-Poisson errors using a log-link function. The best models were selected by the lowest quasi-AICc score, and we demonstrate the level of support for the top models by reporting
QAICc weight. To calculate QAICc scores and weights we also re-ran all models with standard Poisson errors, and corrected AICc scores using the dispersion from the most complex model (date2 + occasion) using the R-package bbmle
(Bolker 2016).
Here a few lines from the code used to create the models with standard Poisson errors:
#model quasi-poisson
modelQP_RHV_Year <- glm(Species ~ Year, family = quasipoisson(link = "log"), data = data)
modelQP_RHV_Year2 <- glm(Species ~ I(Year^2), family = quasipoisson(link = "log"), data = data)
#model poisson
modelP_RHV_Year <- update(modelQP_RHV_Year, family=poisson)
modelP_RHV_Year2 <- update(modelQP_RHV_Year2, family=poisson)
Here are the results:
dqAICc df Weight
modelP_RHV_Year 0.0 2 0.4
modelP_RHV_Year2 0.0 2 0.4
modelP_RHV_Year_Occasion 1.9 3 0.1
modelP_RHV_Year2_Occasion 1.9 3 0.1
modelP_RHV_Occasion 42.4 2 0.0
Looking at the results, I was expected that the degree of freedom of my quadratic term for the Year would be of 3 and Year2 + Occasion would be of 4. My quadratic term appears to not be taken well into account in the model. I wrapped the quadratic term in I() as the ^ operator has a special meaning (not its mathematical one) in an R model formula but apparently it is not the correct way to do it.
If I understand your setup correctly, the problem is that I(Year^2)
includes only the quadratic term and not the associated linear term.
Species ~ Year
would correspond to an underlying linear model b0 + b1*Year
(2 parameters/df) (this is the same as Species ~ 1 + Year
; the 1+
corresponding to the intercept (b0
) is always added to a formula unless you explicitly suppress it with -1
or +0
in the formula)Species ~ 1 + Year + I(Year^2)
(3 df) would correspond to the quadratic model including all three terms, b0 + b1*Year + b2*Year^2
Species ~ 1 + I(Year^2)
(2 df) corresponds to b0 + b2*Year^2
and is a model you should rarely consider unless you have a specific reason to exclude the linear term from the modeldata$cYear <- data$Year - min(data$Year)
to have Year
start at zero in your data set), to improve numerical stability and make the intercept interpretablepoly(Year, 2)
(for an orthogonal polynomial) or poly(Year, 2, raw = TRUE)
(for a standard polynomial) will automatically include the linear and quadratic terms. I would write your full model out as Species ~ Occasion*poly(Year, 2, raw = TRUE)