Tags: r, ggplot2, label, quantile-regression, ggpmisc

Is there a neat approach to label a ggplot plot with the equation and other statistics from geom_quantile()?


I'd like to include the relevant statistics from a geom_quantile() fitted line in a similar way to how I would for a geom_smooth(method="lm") fitted linear regression (where I've previously used ggpmisc which is awesome). For example, this code:

# quantile regression example with ggpmisc equation
# basic quantile code from here:
# https://ggplot2.tidyverse.org/reference/geom_quantile.html

library(tidyverse)
library(ggpmisc)
# see ggpmisc vignette for stat_poly_eq() code below:
# https://cran.r-project.org/web/packages/ggpmisc/vignettes/user-guide.html#stat_poly_eq

my_formula <- y ~ x
#my_formula <- y ~ poly(x, 3, raw = TRUE)

# linear ols regression with equation labelled
m <- ggplot(mpg, aes(displ, 1 / hwy)) +
  geom_point()

m + 
  geom_smooth(method = "lm", formula = my_formula) +
  # after_stat() replaces stat(), which is deprecated since ggplot2 3.4.0
  stat_poly_eq(aes(label =  paste(after_stat(eq.label), "*\" with \"*", 
                                  after_stat(rr.label), "*\", \"*", 
                                  after_stat(f.value.label), "*\", and \"*",
                                  after_stat(p.value.label), "*\".\"",
                                  sep = "")),
               formula = my_formula, parse = TRUE, size = 3)

generates this: [figure: ggplot with linear OLS equation]

For a quantile regression, you can swap out geom_smooth() for geom_quantile() and get a lovely quantile regression line plotted (in this case the median):

# quantile regression - no equation labelling
m + 
  geom_quantile(quantiles = 0.5)
  

[figure: geom_quantile plot]

How would you get the summary statistics out to a label, or recreate them on the fly (i.e. other than fitting the regression before the call to ggplot() and then passing the results to annotate(), similar to what was done here or here for a linear regression)?
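
For reference, the pre-fit route I'd like to avoid would look roughly like the sketch below. It assumes the quantreg package is installed (geom_quantile() relies on it), and the names fit, b, eq_label and p_manual, the label position and the number format are all arbitrary choices made for illustration:

```r
library(ggplot2)
library(quantreg)  # assumed to be installed; geom_quantile() relies on it

# Fit the median regression before plotting
fit <- rq(I(1 / hwy) ~ displ, tau = 0.5, data = mpg)
b <- coef(fit)  # intercept and slope

# Build a plain-text equation label by hand
eq_label <- sprintf("y = %.4f %+.4f x", b[1], b[2])

# Draw the fitted median line and place the label in the top-left corner
p_manual <- ggplot(mpg, aes(displ, 1 / hwy)) +
  geom_point() +
  geom_quantile(quantiles = 0.5, formula = y ~ x) +
  annotate("text", x = -Inf, y = Inf, hjust = -0.1, vjust = 1.5,
           label = eq_label, size = 3)
p_manual
```

This works, but the model is specified twice (once in rq() and once in geom_quantile()), which is exactly the duplication I'm hoping to avoid.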


Solution

  • Package 'ggpmisc' (>= 0.4.5) allows a much simpler answer, which is closer to the solution hoped for by @MarkNeal in his question about median regression. This answer should be preferred to earlier ones when using a recent version of 'ggpmisc'. Not shown: passing se = FALSE to stat_quant_line() disables the confidence band.

    library(ggplot2)
    library(ggpmisc)
    #> Loading required package: ggpp
    #> 
    #> Attaching package: 'ggpp'
    #> The following object is masked from 'package:ggplot2':
    #> 
    #>     annotate
    
    m <- ggplot(mpg, aes(displ, 1 / hwy)) +
      geom_point()
    
    m + 
      stat_quant_line(quantiles = 0.5) +
      stat_quant_eq(aes(label =  paste(after_stat(eq.label), "*\" with \"*", 
                                       after_stat(rho.label), "*\", \"*", 
                                       after_stat(n.label), "*\".\"",
                                       sep = "")),
                    quantiles = 0.5,
                    size = 3)  
    #> Warning in rq.fit.br(x, y, tau = tau, ci = TRUE, ...): Solution may be nonunique
    

    Created on 2022-06-03 by the reprex package (v2.0.1)
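
    The se = FALSE option mentioned above can be sketched as follows. This is the same median-regression plot with the confidence band switched off; the object name p_no_band is only illustrative.

    ```r
    library(ggplot2)
    library(ggpmisc)

    m <- ggplot(mpg, aes(displ, 1 / hwy)) +
      geom_point()

    # Median regression line drawn without its confidence band
    p_no_band <- m +
      stat_quant_line(quantiles = 0.5, se = FALSE) +
      stat_quant_eq(aes(label = paste(after_stat(eq.label), "*\" with \"*",
                                      after_stat(rho.label), "*\", \"*",
                                      after_stat(n.label), "*\".\"",
                                      sep = "")),
                    quantiles = 0.5,
                    size = 3)
    p_no_band
    ```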

    By default, the median and the two other quartiles are fitted (quantiles = c(0.25, 0.5, 0.75)).

    m + 
      stat_quant_line() +
      stat_quant_eq(aes(label =  paste(after_stat(eq.label), "*\" with \"*", 
                                       after_stat(rho.label), "*\", \"*", 
                                       after_stat(n.label), "*\".\"",
                                       sep = "")),
                    size = 3)  
    #> Warning in rq.fit.br(x, y, tau = tau, ci = TRUE, ...): Solution may be nonunique
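
    Other quantiles can be requested by passing the same quantiles vector to both stats. The sketch below uses the 10th, 50th and 90th percentiles, values chosen only for illustration; the object name p_deciles is likewise arbitrary.

    ```r
    library(ggplot2)
    library(ggpmisc)

    m <- ggplot(mpg, aes(displ, 1 / hwy)) +
      geom_point()

    # Illustrative choice of quantiles; keep the two stats in sync
    my_quantiles <- c(0.1, 0.5, 0.9)

    p_deciles <- m +
      stat_quant_line(quantiles = my_quantiles) +
      stat_quant_eq(aes(label = paste(after_stat(eq.label), "*\" with \"*",
                                      after_stat(rho.label), "*\", \"*",
                                      after_stat(n.label), "*\".\"",
                                      sep = "")),
                    quantiles = my_quantiles,
                    size = 3)
    p_deciles
    ```

    Storing the vector in one variable avoids the lines and the equation labels silently referring to different quantiles.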
    


    We can also map the quantiles to color and linetype aesthetics easily.

    m + 
      stat_quant_line(aes(linetype = after_stat(quantile.f),
                          color = after_stat(quantile.f))) +
      stat_quant_eq(aes(label =  paste(after_stat(eq.label), "*\" with \"*", 
                                       after_stat(rho.label), "*\", \"*", 
                                       after_stat(n.label), "*\".\"",
                                       sep = ""),
                        color = after_stat(quantile.f)),
                    size = 3)  
    #> Warning in rq.fit.br(x, y, tau = tau, ci = TRUE, ...): Solution may be nonunique
    


    We can also plot the quartiles as a band by using stat_quant_band() instead of stat_quant_line().

    m + 
      stat_quant_band() +
      stat_quant_eq(aes(label =  paste(after_stat(eq.label), "*\" with \"*", 
                                       after_stat(rho.label), "*\", \"*", 
                                       after_stat(n.label), "*\".\"",
                                       sep = "")),
                    size = 3)  
    #> Warning in rq.fit.br(x, y, tau = tau, ci = TRUE, ...): Solution may be nonunique
    
