r ggplot2

How to create a ggplot stat to calculate and display the range of grouped data: dropped y aesthetic


I'd like to visualize the total range of grouped data as another layer on top of a visualization like geom_jitter(). I'm looking for a flexible approach that doesn't require pre-calculating summary stats into a separate summary data object, so that it's easily reusable across visualizations where the names of the visualized variables change. Ideally it would have an interface as simple as geom_boxplot(), for example.

It seems like the way to do so might be to implement a new "custom" ggplot stat to do the data transformation, as taught in the ggplot2 book.

I'm not too concerned at this point about how exactly this is visualized. A floating bar from min to max would be great, but I'm happy to settle for something like the single vertical line of geom_linerange() or the slightly more prominent capped lines of geom_errorbar().

Since geom_linerange() and geom_errorbar() seem to need only the ymin and ymax aesthetics, I would think that's all the custom stat needs to calculate. I've implemented such a stat below. It produces the expected plots, but it also gives a warning about the y aesthetic being dropped during statistical transformation.

What am I missing?

library(ggplot2)
library(dplyr)

StatMinMax <- ggproto("StatMinMax", Stat,
  compute_group = function(data, scales) {
    data %>%
      summarise(
        ymin = min(y),
        ymax = max(y)
      )
  }
)

stat_minmax <- function(mapping = NULL, data = NULL, geom = "errorbar",
                       position = "identity", na.rm = FALSE, show.legend = NA,
                       inherit.aes = TRUE, ...) {
  layer(
    stat = StatMinMax, mapping = mapping, data = data, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params=list(na.rm = na.rm, ...)
  )
}

diamonds %>%
  ggplot(aes(x=clarity, y=price)) +
  stat_minmax() +
  geom_point()
#> Warning: The following aesthetics were dropped during statistical transformation: y.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?

a plot showing expected results



mtcars %>%
  mutate(cyl=factor(cyl)) %>%
  ggplot(aes(x=cyl, y=mpg)) +
  stat_minmax() +
  geom_point()
#> Warning: The following aesthetics were dropped during statistical transformation: y.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?

another plot with expected results

Created on 2025-06-06 with reprex v2.1.1


Solution

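  • You can avoid the warning by declaring which aesthetics the stat intentionally drops, via the dropped_aes field of the ggproto object. Because the stat collapses each group's y values into ymin and ymax, y is legitimately absent from the output; listing it in dropped_aes tells ggplot2 that the drop is deliberate.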

    The warning stems from a change in ggplot2, which is described as follows:

    As a consequence of this change, stats now need to advertise which aesthetics they are dropping

    library(ggplot2)
    library(dplyr, warn.conflicts = FALSE)
    
    StatMinMax <- ggproto("StatMinMax", Stat,
      dropped_aes = c("y"),  # y is summarised into ymin/ymax, so declare it as dropped
      compute_group = function(data, scales) {
        data %>%
          summarise(
            ymin = min(y),
            ymax = max(y)
          )
      }
    )
    
    stat_minmax <- function(mapping = NULL, data = NULL, geom = "errorbar",
                            position = "identity", na.rm = FALSE, show.legend = NA,
                            inherit.aes = TRUE, ...) {
      layer(
        stat = StatMinMax, mapping = mapping, data = data, geom = geom,
        position = position, show.legend = show.legend, inherit.aes = inherit.aes,
        params = list(na.rm = na.rm, ...)
      )
    }
    
    diamonds %>%
      ggplot(aes(x = clarity, y = price)) +
      stat_minmax() +
      geom_point()
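
As a quick check of the approach above (a sketch of my own, not part of the original answer), the block below rebuilds the stat with base R instead of dplyr, calls compute_group() by hand on one toy group to show that its result has no y column, and then builds the question's mtcars example while collecting warnings. The helper collect_warnings() is hypothetical, and the linewidth and alpha settings are just one assumed way to approximate the "floating bar" look mentioned in the question.

```r
library(ggplot2)

# Same idea as the stat above, using base R so dplyr is not needed.
StatMinMax <- ggproto("StatMinMax", Stat,
  dropped_aes = "y",  # y is summarised into ymin/ymax on purpose
  compute_group = function(data, scales) {
    data.frame(ymin = min(data$y), ymax = max(data$y))
  }
)

stat_minmax <- function(mapping = NULL, data = NULL, geom = "errorbar",
                        position = "identity", na.rm = FALSE, show.legend = NA,
                        inherit.aes = TRUE, ...) {
  layer(
    stat = StatMinMax, mapping = mapping, data = data, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}

# The per-group result has no y column; that is the aesthetic being "dropped".
one_group <- StatMinMax$compute_group(
  data.frame(x = 1, y = c(3, 7, 5)),
  scales = NULL
)

# Hypothetical helper: build a plot and collect any warning messages it raises.
collect_warnings <- function(p) {
  msgs <- character(0)
  withCallingHandlers(
    ggplot_build(p),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  msgs
}

# A thick, semi-transparent linerange as an assumed stand-in for a floating bar.
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_minmax(geom = "linerange", linewidth = 4, alpha = 0.3) +
  geom_point()

warnings_seen <- collect_warnings(p)

# Computed data for the first layer: one row per cyl group.
ranges <- layer_data(p, 1)[, c("x", "ymin", "ymax")]
```

If the fix behaves as described, warnings_seen should contain no message about dropped aesthetics, and ranges should hold one row per cylinder group. The x column is still present in ranges even though compute_group() never returns it, because it is constant within each group.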