mgcv

mgcv::gam does not correctly decompose the linear component from the smooth


I am interested in testing the linear trend separately from the nonlinear component in a GAMM.

I have followed an example from the mgcv documentation, and answers to related questions on Stack Exchange, and used m=c(2,0) with the default tp smooth. The example works according to my expectations: (1) the slope estimated for the parametric coefficient for x is quite close to the estimate from simple linear regression; and (2) the smooth appears detrended and of the expected (very small) magnitude.

However, when I try to do this with my data, the result makes no sense to me. The linear component appears arbitrary (and occasionally does not even have the correct sign, as in the following example), and the smooth is not detrended (it obviously still includes a strong linear component).

I wonder whether I have misunderstood what this is supposed to do, or whether I am coding it incorrectly somehow. Since the documentation example works fine, I do not suppose it is a bug in mgcv.

How else can I test the linear trend separately from the nonlinear component in a GAMM?

This is the relevant example from the mgcv documentation:

require(mgcv)
n <- 100
set.seed(2)
x <- runif(n)
y <- x + x^2*.2 + rnorm(n) *.1
ed <- data.frame(y,x)
mg <- gam(y~s(x,m=c(2,0))+x,data=ed,method="REML")

As I understand it, this removes the linear component from the smooth so that it can be separately estimated.

Here are the slope estimates from gam and lm, pretty close:

> summary(mg)

Family: gaussian 
Link function: identity 

Formula:
y ~ s(x, m = c(2, 0)) + x

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.02133    0.05315  -0.401    0.689    
x            1.18249    0.10564  11.193   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
        edf Ref.df     F p-value  
s(x) 0.9334      8 0.304   0.076 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =   0.91   Deviance explained = 91.1%
-REML = -70.567  Scale est. = 0.012767  n = 100

> summary(lm(y~x,ed))


Call:
lm(formula = y ~ x, data = ed)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.24082 -0.07942 -0.01136  0.09430  0.22836 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.03019    0.02212  -1.365    0.175    
x            1.20052    0.03851  31.175   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1144 on 98 degrees of freedom
Multiple R-squared:  0.9084,    Adjusted R-squared:  0.9075 
F-statistic: 971.9 on 1 and 98 DF,  p-value: < 2.2e-16

Here is the graph of the data with the superimposed predictions from lm (red) and gam (blue):

plot(y~x,ed)
abline(lm(y~x,ed),col="red")
points(ed$x,predict(mg,ed),col="blue")

plot with data and model predictions from lm and gam

And here is the smooth, which looks perfectly detrended and very small, consistent with the graph:

plot.gam(mg)

plot of x smooth

Here is the same analysis using my data.

ev <- read.table("ev.txt")
me <- gam(logart~s(item,m=c(2,0))+item,data=ev,method="REML")

The linear slope for item estimated by gam is negative (-0.11), although the actual slope is positive (+0.032):

> summary(me)

Family: gaussian 
Link function: identity 

Formula:
logart ~ s(item, m = c(2, 0)) + item

Parametric coefficients:
             Estimate Std. Error  t value Pr(>|t|)    
(Intercept)  6.305740   0.003587 1758.100   <2e-16 ***
item        -0.112921   0.049174   -2.296   0.0217 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
          edf Ref.df     F  p-value    
s(item) 6.125      8 3.583 3.15e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.0298   Deviance explained = 3.14%
-REML = -842.54  Scale est. = 0.039768  n = 4453


> summary(lm(logart~item,ev))

Call:
lm(formula = logart ~ item, data = ev)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.65009 -0.12830  0.00627  0.13301  0.61885 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 6.311575   0.003001 2103.47   <2e-16 ***
item        0.031751   0.003048   10.42   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2001 on 4451 degrees of freedom
Multiple R-squared:  0.0238,    Adjusted R-squared:  0.02359 
F-statistic: 108.5 on 1 and 4451 DF,  p-value: < 2.2e-16

Here is the plot of the data along with the predictions from lm and gam, both of which look perfectly reasonable:

plot(logart~item,ev,col="#20202020")
abline(lm(logart~item,ev),col="red")
points(ev$item,predict(me,ev),col="blue")

plot of my data with lm and gam predictions

So the overall model seems fine except that the smooth is not in fact detrended, and seems to be doing most of the work (note the y axis range):

plot.gam(me)

plot of smooth

(The results are the same with bam, and explicitly specifying bs='tp' makes no difference, as expected.)

Is this even possible? How could different datasets lead to such different results, with the linear component in gam/bam (mis)matching the lm/lmer estimate? Have I misunderstood the point of (or the process for) testing for nonlinearity separately? How else should I do it?


Solution

  • The following answer was sent by Simon Wood by e-mail:

    The smoothing basis is not orthogonalized to the linear trend, so you would not expect the slope estimates to match those from lm. The point of this construction is really to enable testing of whether anything is needed beyond a linear trend, i.e. whether the smooth is needed rather than a simple linear effect.
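    A minimal sketch of that intended use, reusing the simulated example from the question: with m=c(2,0) the penalty has no null space, so the whole smooth can be shrunk to zero under REML, and its approximate p-value tests whether anything beyond the linear term is needed.

        require(mgcv)
        set.seed(2)
        n <- 100
        x <- runif(n)
        y <- x + x^2*.2 + rnorm(n)*.1
        ed <- data.frame(y,x)
        # full-rank penalty: the smooth can be shrunk to (effectively) zero
        mg <- gam(y ~ x + s(x, m=c(2,0)), data=ed, method="REML")
        summary(mg)$s.table  # small edf / large p-value => a linear fit suffices

    Here the smooth's edf is close to zero and its p-value is non-significant (as in the summary output in the question), which is exactly the question this construction is designed to answer.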

    I can't say I understand exactly what the linear slope represents in this model, but it sounds like the model is not supposed to produce a separation into a linear component and a detrended nonlinear component, as I had incorrectly assumed. To obtain that, Wood suggested one could

    clone the smooth constructor and add the code to orthogonalize the basis
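
    One way to sketch that suggestion without writing a full smooth constructor (a hypothetical, unpenalized illustration, not Wood's actual code) is to build the tp basis with mgcv::smoothCon, project out the intercept and linear trend by hand, and fit the decomposed model. Because the modified basis columns are orthogonal to (1, item), the normal equations decouple and the item coefficient then matches the simple-regression slope:

        library(mgcv)
        ev <- read.table("ev.txt")  # assumes columns logart and item, as in the question
        # build the tp basis that s(item, m=c(2,0)) would use
        sm <- smoothCon(s(item, m=c(2,0)), data=ev)[[1]]
        X  <- sm$X
        # project each basis column onto the orthogonal complement of (1, item)
        Z  <- cbind(1, ev$item)
        Xo <- X - Z %*% solve(crossprod(Z), crossprod(Z, X))
        # unpenalized fit of the decomposed model
        # (lm drops the two now-aliased columns of Xo as NA)
        fit <- lm(logart ~ item + Xo, data=ev)
        coef(fit)["item"]  # agrees with the slope from lm(logart ~ item)

    Note that this sketch skips the smoothing penalty, so the nonlinear part may be over-flexible; a penalized version would have to pass Xo and its (projected) penalty via gam's paraPen mechanism.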