r ggplot2 facet

How to position geom_text labels neatly when using stat_summary() + after_stat() to mark median/mean over a histogram


Consider the following plot:

library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

p <- 
  diamonds |> 
  filter(color %in% c("D", "E", "F") & cut %in% c("Very Good", "Premium")) |> 
  ggplot(aes(x = depth)) +
  geom_histogram() +
  facet_grid(cut ~ color)

p
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2024-05-16 with reprex v2.0.2

Say that I want to mark the mean and median in each facet. I want to calculate them using after_stat() because it's flexible for my needs. Based on this answer, I can do the following:


p +
  stat_summary(
    aes(
      label = after_stat(paste0("Mean = ", round(x, 2))),
      y = 0
    ),
    fun = mean,
    geom = "text",
    orientation = "y",
    color = "red",
    hjust = 1,
    vjust = -2 * 2,
    position = position_nudge(x = .1)
  ) +
  stat_summary(
    aes(
      label = after_stat(paste0("Median = ", round(x, 2))),
      y = 0
    ),
    fun = median,
    geom = "text",
    orientation = "y",
    color = "blue",
    hjust = 1,
    vjust = -4 * 2,
    position = position_nudge(x = .1)
  ) 
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2024-05-16 with reprex v2.0.2

The Problem

The labels are off. I want to align them neatly relative to one another. Can I position them relative to the top-left corner of each panel?

Expected output

[Image: Demo]


Solution

  • One "freaky" option is to use a single stat_summary() that computes both the mean and the median in one go. This lets you combine them into one label, which can then be placed neatly in the top-left corner of each panel. To make this work, I use stage() to set the x and y positions to -Inf and Inf after the stat has been applied. To still get different colors, I switched to ggtext::GeomRichText, which allows styling the labels with some HTML and CSS.

    library(ggplot2)
    
    p +
      stat_summary(
        aes(
          label = after_stat({
            means <- vapply(x, `[[`, "mean", FUN.VALUE = numeric(1))
            medians <- vapply(x, `[[`, "median", FUN.VALUE = numeric(1))
            paste(
              paste0("<span style='color: red'>Mean = ", round(means, 2), "</span>"),
              paste0("<span style='color: blue'>Median = ", round(medians, 2), "</span>"),
              sep = "<br>"
            )
          }),
          x = stage(depth, after_stat = -Inf),
          y = stage(0, after_stat = Inf)
        ),
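        # Return both statistics wrapped in a list so that after_stat()
        # can unpack them from the computed x column
        fun = \(x) {
          list(
            list(mean = mean(x), median = median(x))
          )
        },
        geom = ggtext::GeomRichText,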
        orientation = "y",
        hjust = 0,
        vjust = 1,
        fill = NA,
        label.size = 0
      )
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
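
To isolate the corner-anchoring part of the approach, here is a minimal sketch that pins one label per panel to the top-left corner using x = -Inf, y = Inf together with hjust = 0, vjust = 1. It deliberately swaps the after_stat() computation for summaries precomputed with dplyr, so it is a simplification rather than the method above; the names d, panel_labels and corner_plot are made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

# Same subset of diamonds as in the question
d <- diamonds |>
  filter(color %in% c("D", "E", "F") & cut %in% c("Very Good", "Premium"))

# Precompute one label per panel (assumption: summaries are computed
# outside ggplot2 here, unlike the stat_summary() approach above)
panel_labels <- d |>
  group_by(cut, color) |>
  summarise(
    label = paste0(
      "Mean = ", round(mean(depth), 2), "\n",
      "Median = ", round(median(depth), 2)
    ),
    .groups = "drop"
  )

corner_plot <- ggplot(d, aes(x = depth)) +
  geom_histogram(bins = 30) +
  geom_text(
    data = panel_labels,
    # -Inf / Inf resolve to the left and top edges of each panel
    aes(x = -Inf, y = Inf, label = label),
    # anchor the text by its own top-left corner
    hjust = 0, vjust = 1
  ) +
  facet_grid(cut ~ color)

corner_plot
```

Because infinite positions are resolved per panel, the labels land in the same corner regardless of each facet's data range. Values slightly outside the 0 to 1 range, such as hjust = -0.1 and vjust = 1.1, would add a little padding from the panel border.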