r ggplot2 label

Can R display the calculated boxplot values in ggplot2 as labels?


As the title says, can ggplot2 display the calculated boxplot values as labels? I would like the max, upper hinge, median, lower hinge, and min values displayed.

With the simple example of:

library(ggplot2)

ggplot()+
  geom_boxplot(data = mtcars,
               aes(group = cyl,
                   x = gear, y = mpg))

[Figure: boxplot of mtcars]

I understand that I could calculate these values myself with something simple like the following and use them as labels to get the graph below, but my real data is much more complex, which makes it difficult to calculate the values this way.

library(dplyr)  # provides %>%, group_by() and summarise()

mtcarsProbs <- mtcars %>%
  group_by(cyl) %>%
  summarise(min = quantile(mpg, probs = 0),
            `25th` = quantile(mpg, probs = 0.25),
            median = quantile(mpg, probs = 0.5),
            `75th` = quantile(mpg, probs = 0.75),
            max = quantile(mpg, probs = 1))

ggplot()+
  geom_boxplot(data = mtcars,
               aes(group = cyl,
                   x = cyl, y = mpg))+
  geom_text(data = mtcarsProbs,
            aes(x = cyl, y = max+1,
                label = max))+
  geom_text(data = mtcarsProbs,
            aes(x = cyl, y = `75th`-0.5,
                label = `75th`))+
  geom_text(data = mtcarsProbs,
            aes(x = cyl, y = median+0.5,
                label = median))+
  geom_text(data = mtcarsProbs,
            aes(x = cyl, y = `25th`+0.5,
                label = `25th`))+
  geom_text(data = mtcarsProbs,
            aes(x = cyl, y = min-1,
                label = min))

[Figure: mtcars boxplot with labels]

Edit: Is there a way to have ggplot2 calculate and display these values automatically?


Solution

  • First create the plot without text and store it.

    library(tidyverse)
    
    p <- ggplot(mtcars) +
      geom_boxplot(aes(group = cyl, x = gear, y = mpg), fill = "gray92") 
    

    You can then get this plot's layer_data(), which holds the statistics ggplot2 computed for each box (ymin, lower, middle, upper and ymax) along with their x positions, and pivot it into long format. The resulting data frame contains the correct x and y positions for a text layer, which you can add back to the original plot.

    To make the text sit nicely relative to the boxes, set vjust rather than adding arbitrary offsets to y. Fixed y offsets are in data units, so they stop looking right when the plot is rescaled or the code is reused with new data.

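    p +
      # The first five columns of layer_data(p) are ymin, lower, middle, upper, ymax
      geom_text(data = pivot_longer(layer_data(p), 1:5),
                aes(x = x, y = value, label = value),
                # One vjust per statistic, repeated for each of the 3 boxes
                vjust = rep(c(1.5, -0.5, -0.5, 1.5, -0.5), 3)) +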
      theme_minimal(base_size = 20)
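
The call above hard-codes both the column positions (1:5) and the number of boxes (3). As a rough sketch of a variant that avoids both, the statistics can be selected by name and vjust looked up per statistic. The names `box_stats`, `labels`, `nudge` and `p_labelled` are made up for this illustration and are not part of ggplot2 or of the answer above.

```r
library(ggplot2)
library(tidyr)

p <- ggplot(mtcars) +
  geom_boxplot(aes(group = cyl, x = gear, y = mpg), fill = "gray92")

# One row per box, holding the statistics ggplot2 computed for the layer
box_stats <- layer_data(p)

# Keep only the columns needed for labelling, then go to one row per label
labels <- pivot_longer(
  box_stats[, c("x", "group", "ymin", "lower", "middle", "upper", "ymax")],
  cols = c(ymin, lower, middle, upper, ymax),
  names_to = "stat",
  values_to = "value"
)

# Illustrative lookup table (an assumption, tune to taste):
# vjust > 1 places the label below its y value, vjust < 0 places it above
nudge <- c(ymin = 1.5, lower = -0.5, middle = -0.5, upper = 1.5, ymax = -0.5)
labels$vjust <- unname(nudge[labels$stat])

# vjust is mapped as an aesthetic, so it follows the data however many boxes there are
p_labelled <- p +
  geom_text(data = labels,
            aes(x = x, y = value, label = value, vjust = vjust)) +
  theme_minimal(base_size = 20)
```

Printing `p_labelled` draws the plot. Note that ymin and ymax are the whisker ends, which by default stop at the most extreme point within 1.5 times the interquartile range of the hinges, so they can differ from the raw minimum and maximum when a group has outliers.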
    
