r, median, violin-plot

Adding median and quartile range to multiple grouped violin plot in R


I use the following code in R to produce a violin plot for 5 variables ("CAP1-5") across 3 groups (BP, BPoff, HC). This code worked:

ggplot(data, aes(x = CAP, y = Value, fill = GROUP)) +
  geom_violin(scale = "width", trim = FALSE) +
  scale_fill_manual(values = c("BP" = "red", "BPoff" = "grey", "HC" = "white")) +
  xlab("CAP") +
  ylab("Value") +
  theme_minimal() +
  facet_wrap(~ CAP, scales = "free_x", nrow = 1)
(output attached). 

I would like to add the median and quartiles to each violin, but with the lines of code below they only appear on the middle violin of each variable. How can I add them to every violin?

geom_point(data = summary_data, aes(x = CAP, y = median), shape = 23, size = 3, fill = "white") +
  geom_errorbar(data = summary_data, aes(x = CAP, ymin = lower, ymax = upper), width = 0.2, color = "black") +

Thank you so much!


Solution

  • There are a few options to add quartiles to your plot.

    First, use the draw_quantiles parameter of geom_violin, which draws a horizontal line inside each violin at the requested quantiles.

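    library(tidyverse)
    
    # Simulated data: 3 groups x 5 variables, 20 values per combination
    set.seed(1)
    df <- data.frame(group = rep(LETTERS[1:3], each = 100),
                     var = rep(c("CAP1", "CAP2", "CAP3", "CAP4", "CAP5"), times = 60),
                     value = rnorm(300))
    
    # Median and quartiles per group and variable, in long format
    # (used by the geom_point option further down)
    summ_df <- df %>%
      group_by(group, var) %>%
      summarize(median = median(value),
                Lower = quantile(value, probs = 0.25),
                Upper = quantile(value, probs = 0.75),
                .groups = "drop") %>%
      pivot_longer(cols = median:Upper, names_to = "quantile", values_to = "estimate")
    
    # Horizontal lines at the 25%, 50% and 75% quantiles inside each violin
    df %>% ggplot(aes(x = var, y = value, fill = group)) +
      geom_violin(scale = "width", trim = FALSE, draw_quantiles = c(0.25, 0.5, 0.75))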
    


    Second, use geom_point with a data frame containing the summary estimates, plus a group aesthetic and position_dodge so that the points line up with their violins (0.9 matches the default dodge width of geom_violin).

    df %>% ggplot(aes(x=var, y=value, fill=group)) + 
      geom_violin(scale="width", trim=FALSE) + 
      geom_point(data = summ_df, aes(x=var, y=estimate, group=group), 
                 position = position_dodge(width=0.9))
    

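Closer to the layers in the question, the same dodging idea can be applied to a median marker plus an error bar for the interquartile range. The sketch below is only an illustration under a few assumptions: it reuses the simulated data from this answer, `summ_wide` is a made-up name for a wide summary table (one row per group and variable), and the likely reason the original markers landed on the middle violin is that those layers had no `position_dodge`.

```r
library(ggplot2)
library(dplyr)

# Simulated data with the same shape as the example in this answer
set.seed(1)
df <- data.frame(group = rep(LETTERS[1:3], each = 100),
                 var   = rep(paste0("CAP", 1:5), times = 60),
                 value = rnorm(300))

# One row per group/variable with median and quartiles (wide format)
# summ_wide is a hypothetical name chosen for this sketch
summ_wide <- df %>%
  group_by(group, var) %>%
  summarize(median = median(value),
            lower  = quantile(value, probs = 0.25),
            upper  = quantile(value, probs = 0.75),
            .groups = "drop")

# The same dodge width on every layer keeps markers centred on their violins
dodge <- position_dodge(width = 0.9)

p <- ggplot(df, aes(x = var, y = value, fill = group)) +
  geom_violin(scale = "width", trim = FALSE, position = dodge) +
  # inherit.aes = FALSE because summ_wide has no `value` column
  geom_errorbar(data = summ_wide,
                aes(x = var, ymin = lower, ymax = upper, group = group),
                inherit.aes = FALSE, width = 0.2, position = dodge) +
  geom_point(data = summ_wide,
             aes(x = var, y = median, group = group),
             inherit.aes = FALSE, shape = 23, size = 3, fill = "white",
             position = dodge)

print(p)
```

The explicit `group = group` mapping matters here: `fill = "white"` is a fixed setting rather than a mapping, so without a group aesthetic the point layer has nothing to dodge by.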

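    Third, use stat_summary with median_hilow, which needs the Hmisc package to be installed. By default median_hilow spans the central 95% of the data, so pass conf.int = 0.5 to get the quartiles instead.

    # Wrapper around stat_summary; extra arguments are passed through
    stat_sum_df <- function(fun, geom = "crossbar", ...) {
      stat_summary(fun.data = fun, colour = "red", geom = geom, width = 0.2, ...)
    }
    
    df %>% ggplot(aes(x = var, y = value, fill = group)) +
      geom_violin(scale = "width", trim = FALSE) +
      stat_sum_df("median_hilow", fun.args = list(conf.int = 0.5),
                  mapping = aes(group = group), position = position_dodge(width = 0.9))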
    

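If installing Hmisc is not an option, stat_summary also accepts any function that returns a data frame with `y`, `ymin` and `ymax` columns. The following is a minimal sketch under that assumption; `median_iqr` is a hypothetical helper written for this example, not part of ggplot2 or Hmisc, and the data are the same simulated values used in this answer.

```r
library(ggplot2)

# Hypothetical helper: median with first and third quartiles, in the
# y / ymin / ymax layout that stat_summary(fun.data = ...) expects
median_iqr <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(y = q[2], ymin = q[1], ymax = q[3])
}

# Simulated data with the same shape as the example in this answer
set.seed(1)
df <- data.frame(group = rep(LETTERS[1:3], each = 100),
                 var   = rep(paste0("CAP", 1:5), times = 60),
                 value = rnorm(300))

p <- ggplot(df, aes(x = var, y = value, fill = group)) +
  geom_violin(scale = "width", trim = FALSE) +
  # One summary per group within each variable, dodged like the violins
  stat_summary(aes(group = group), fun.data = median_iqr,
               geom = "pointrange", position = position_dodge(width = 0.9))

print(p)
```

Because the summary is computed from the raw data inside the plot, no separate summary data frame is needed; swapping `geom = "pointrange"` for `"crossbar"` or `"errorbar"` changes only how the same three numbers are drawn.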