rglmpoissonmodel-comparison

Issue with including a quadratic term into a GLM quasi-poisson


I am currently looking at country-wide population ‘censuses’ that were conducted by feeding vultures at all restaurant sites simultaneously. Simultaneous census counts were conducted twice in June each year, with 10–14 days between each occasion, in 2004, and from 2006 to 2023. During those events, three species of vultures are monitored and the number of individuals feeding is counted.

For each species, we fit one set of generalised linear models. For the census data we modelled the total count at all sites as a function of either a linear or quadratic effect of year, and an additive fixed effect of sampling occasions, i.e. the first or second of the two census occasions per year. Census occasion may be important if birds that were present at the first census occasion may not disperse widely after feeding and are more likely to be present at the second occasion. When looking at the degree of freedom in the results, I feel like the quadratic term is not accurately taken into account into the model.

Here is a subset of the data we are working on:

    Date    RHV SBV WRV Total   Census  Month   Year    Season  Occasion
6/10/2004   40  34  88  162 Yes 6   2004    Wet 1
6/20/2004   42  25  90  157 Yes 6   2004    Wet 2
6/10/2005   58  27  149 234 Yes 6   2005    Wet 1
6/20/2005   32  31  83  146 Yes 6   2005    Wet 2
6/10/2007   35  24  160 219 Yes 6   2007    Wet 1
6/20/2007   40  26  150 216 Yes 6   2007    Wet 2
6/10/2008   48  30  113 191 Yes 6   2008    Wet 1
6/20/2008   44  51  191 286 Yes 6   2008    Wet 2
6/10/2009   11  84  50  145 Yes 6   2009    Wet 1
6/20/2009   13  30  209 252 Yes 6   2009    Wet 2

Here are the results the results of summary(data) , the column Date_Num was not used for this analysis (Date_Num corresponds to the number of days, taking as baseline the 1st of June 2004).

      Date                 RHV             SBV             WRV             Total                      Month  
 Min.   :2004-06-10   Min.   :10.00   Min.   :15.00   Min.   : 42.00   Min.   : 70.0         
 1st Qu.:2009-09-16   1st Qu.:15.25   1st Qu.:28.50   1st Qu.: 71.25   1st Qu.:122.0    
 Median :2014-06-15   Median :21.00   Median :35.00   Median : 89.00   Median :155.0   
 Mean   :2014-05-07   Mean   :25.89   Mean   :37.29   Mean   :104.24   Mean   :167.4                        
 3rd Qu.:2019-03-13   3rd Qu.:38.00   3rd Qu.:44.50   3rd Qu.:136.00   3rd Qu.:214.8                        
 Max.   :2023-06-16   Max.   :58.00   Max.   :84.00   Max.   :209.00   Max.   :289.0                       
      Year        Occasion      Date_Num   
 Min.   :2004   Min.   :1.0   Min.   :   9  
 1st Qu.:2009   1st Qu.:1.0   1st Qu.:1934  
 Median :2014   Median :1.5   Median :3666  
 Mean   :2014   Mean   :1.5   Mean   :3627  
 3rd Qu.:2019   3rd Qu.:2.0   3rd Qu.:5398  
 Max.   :2023   Max.   :2.0   Max.   :6954 

Here is the line for the GLM we used

modelRHVDate <- glm(Species ~ Year + I(Year^2) + Occasion + Year:Occasion + I(Year^2):Occasion , 
                family = quasipoisson(link = "log"), 
                data = data)
summary(modelRHVDate)

Here are the results:

Call:
glm(formula = Species ~ Year + I(Year^2) + Occasion + Year:Occasion + 
    I(Year^2):Occasion, family = quasipoisson(link = "log"), 
    data = data)

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         2.560e+04  2.341e+04   1.094    0.282
Year               -2.535e+01  2.327e+01  -1.090    0.284
I(Year^2)           6.274e-03  5.780e-03   1.086    0.286
Occasion           -1.531e+04  1.475e+04  -1.037    0.307
Year:Occasion       1.519e+01  1.466e+01   1.036    0.308
I(Year^2):Occasion -3.769e-03  3.641e-03  -1.035    0.308

(Dispersion parameter for quasipoisson family taken to be 2.874796)

    Null deviance: 228.254  on 37  degrees of freedom
Residual deviance:  99.214  on 32  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4

Models were fitted with quasi-Poisson errors using a log-link function. The best models were selected by the lowest quasi-AICc score, and we demonstrate the level of support for the top models by reporting QAICc weight. To calculate QAICc scores and weights we also re-ran all models with standard Poisson errors, and corrected AICc scores using the dispersion from the most complex model (date2 + occasion) using the R-package bbmle (Bolker 2016).

Here a few lines from the code used to create the models with standard Poisson errors:

#model quasi-poisson
modelQP_RHV_Year <- glm(Species ~ Year, family = quasipoisson(link = "log"), data = data)
modelQP_RHV_Year2 <- glm(Species ~ I(Year^2), family = quasipoisson(link = "log"), data = data)
#model poisson
modelP_RHV_Year <-  update(modelQP_RHV_Year, family=poisson)
modelP_RHV_Year2 <- update(modelQP_RHV_Year2, family=poisson)

Here are the results:

                          dqAICc df Weight
modelP_RHV_Year            0.0   2   0.4  
modelP_RHV_Year2           0.0   2   0.4  
modelP_RHV_Year_Occasion   1.9   3   0.1  
modelP_RHV_Year2_Occasion  1.9   3   0.1  
modelP_RHV_Occasion       42.4   2   0.0  

Looking at the results, I was expected that the degree of freedom of my quadratic term for the Year would be of 3 and Year2 + Occasion would be of 4. My quadratic term appears to not be taken well into account in the model. I wrapped the quadratic term in I() as the ^ operator has a special meaning (not its mathematical one) in an R model formula but apparently it is not the correct way to do it.


Solution

  • If I understand your setup correctly, the problem is that I(Year^2) includes only the quadratic term and not the associated linear term.