rregressiongammgcvgratia

How to plot logistic probability for custom generalized additive model (GAM) plot?


I realize that there is already a question pertaining to this here. However, I'm not looking to use the draw function and would prefer to build up the plot by scratch like shown here. However, the plot shown in the link doesn't show how to do this with logistic probability. I show an example that gets close, but only plots by the link function.

#### Libraries and Data ####
library(mgcv)
library(gamair)
library(tidyverse)
data("wesdr")

#### Fit Model ####
fit <- gam(
  ret ~ s(dur),
  method = "REML",
  family = binomial,
  data = wesdr
)

#### Evaluate the Smooths ####
sm <- smooth_estimates(fit) %>%
  add_confint()
sm

#### Add Partial Residuals ####
wesdr <- wesdr %>% 
  add_partial_residuals(fit)

#### Plot ####
p <- sm %>%
  filter(smooth == "s(dur)") %>%
  ggplot() +
  geom_rug(aes(x = dur),
           data = wesdr,
           sides = "b", 
           length = grid::unit(0.02, "npc")) +
  geom_ribbon(aes(ymin = lower_ci, 
                  ymax = upper_ci,
                  x = dur),
              alpha = 0.2) +
  geom_line(aes(x = dur,
                y = est),
            lwd = 1.2) +
  labs(y = "Partial effect",
       title = "s(dur)")
p

enter image description here

Specifically, I'm looking for something functionally equivalent to this:

plot(fit,
     trans = plogis,
     shift = coef(fit)[1])

enter image description here

Any advice on how to get the probability plot?

Edit

I realize I wasn't precise about what I needed. The answer given is good, but I'm considering the most general-case use where a logistic GAM has multiple predictors. So I need a plot based off a model like this:

#### Fit Model ####
fit <- gam(
  ret 
  ~ s(dur)
  + s(bmi),
  method = "REML",
  family = binomial,
  data = wesdr
)

To account for the average value of other predictors, I need to include the intercept into the plot, which is why I originally used shift in the plot.gam function in base R.


Solution

  • The general way to do this is to predict from the model at the values you want. Using your extended example,

    library("gratia")
    library("mgcv")
    data("wesdr", package = "gamair")
    fit <- gam(ret ~ s(dur) + s(bmi),
               data = wesdr, method = "REML", family = binomial)
    

    Create a data slice at the values of the covariates you want. If you just specify dur in the data slice then you will get the other covariate, bmi set to the value of the observation closest to the median of bmi in the training data

    ds1 <- data_slice(fit, dur = evenly(dur, n = 100))
    

    But it is easy to specify other values; you mentioned setting the other covariates to their mean:

    ds2 <- data_slice(fit, dur = evenly(dur, n = 100), bmi = mean(bmi))
    

    Then you predict():

    fv2 <- fitted_values(fit, data = ds2, scale = "response")
    

    Then plot

    library("ggplot2")
    
    fv2 |>
      ggplot(aes(x = dur, y = fitted)) +
      geom_ribbon(aes(x = dur, ymin = lower, ymax = upper),
                  inherit.aes = FALSE, alpha = 0.2) +
      geom_line() +
      geom_rug(data = wesdr, aes(x = dur), sides = "b", inherit.aes = FALSE, 
               length = grid::unit(0.01, "npc"), alpha = 0.5)
    

    which produces

    enter image description here