rmgcvgratia

combining linear and random effect smooth in mgcv for plotting


Consider a simple GAM (using mgcv) with one linear smooth and one random intercept term like this:

gam(y ~ s(x1) + s(x2, bs="re"), method = 'REML', 
    family = 'gaussian', data = my.data) 

I know I can use gratia::smooth_estimates() to get nicely spaced predictions for x1, and the random effects for each level of x2, all in one data frame. Is there a neat way of producing the smooth estimates for x1 for each level of x2 all in one place for plotting, i.e. taking each value for x1 and adding on each value of x2 to give n(x2) smooths which stack on each other when plotted.


Solution

  • That would be prediction depending on whether you want to include or exclude the model constant term (and from the example I would argue it doesn't matter if you include the intercept).

    If so, you can use the fitted_values() function in {gratia} to generate the values you want by providing it with a suitable data slice:

    library("gratia")
    library("mgcv")
    m <- gam(y ~ s(x1) + s(x2, bs = "re"), method = "REML", 
             family = gaussian(), data = my.data) 
    ds <- data_slice(m, x1 = evenly(x1, n = 100), x2 = evenly(x2))
    fv <- fitted_values(m, data = ds, scale = "response")
    

    If you really want to exclude the constant term, then you can exclude or select any combination of terms in the model, including parametric terms using the exclude or terms arguments respectively to predict.gam() (which fitted_values() passes on for you). You do have to be careful to spell the names of the terms correctly. If in doubt, run summary() on your model and note how mgcv refers to the terms.

    For example, to remove the intercept, we modify the above fitted_values() call to

    fv <- fitted_values(m, data = ds, scale = "response", exclude = "(Intercept)")
    

    Using exclude (or terms) what can produce estimated values for any additive combination of model terms for the covariate values supplied by to the function. fitted_values() is just a nice wrapper around predict.gam(), which also produces the appropriate credible interval for you - if you ask for scale = "response", it fits on the link scale and then back transforms the estimates and their uncertainty band to the response scale using the inverse of the link function.