r ggplot2 significance

Adding significance levels (for percentage differences) to a multilevel dodge ggplot2


I am trying to use ggsignif package, and reproduce the answer given by const-ae, here, with the following data:

library(ggplot2)
library(ggsignif)
library(magrittr)  # provides the %>% pipe used below
counts <- structure(list(ECOST = c("0.52", "0.52", "0.39", "0.39", "0.26", 
"0.26", "0.13", "0.13", "0.00", "0.00"), group = c("control", 
"treatment", "control", "treatment", "control", "treatment", 
"control", "treatment", "control", "treatment"), count = c(18, 
31, 30, 35, 47, 46, 66, 68, 86, 86), percentage = c(16.3636363636364, 
31.9587628865979, 27.2727272727273, 36.0824742268041, 42.7272727272727, 
47.4226804123711, 60, 70.1030927835051, 78.1818181818182, 88.659793814433
), total = c(110, 97, 110, 97, 110, 97, 110, 97, 110, 97), negative_count = c(92, 
66, 80, 62, 63, 51, 44, 29, 24, 11), p_value = c(0.00843644912924255, 
0.00843644912924255, 0.172947686684261, 0.172947686684261, 0.497952719783453, 
0.497952719783453, 0.128982570547408, 0.128982570547408, 0.0447500820026408, 
0.0447500820026408)), row.names = c(NA, -10L), class = c("data.table", 
"data.frame"))

counts %>% 
    ggplot(aes(x = ECOST, y = percentage, fill = group, label=sprintf("%.02f %%", round(percentage, digits = 1)))) + 
    geom_col(position = 'dodge') + 
    geom_text(position = position_dodge(width = .9),    # move to center of bars
              vjust = -0.5,    # nudge above top of bar
              size = 4) +           
    scale_fill_grey(start = 0.8, end = 0.5) +
    theme_bw(base_size = 15)

Data:

    ECOST     group count percentage total negative_count p_value
 1:  0.52   control    18         16   110             92  0.0084
 2:  0.52 treatment    31         32    97             66  0.0084
 3:  0.39   control    30         27   110             80  0.1729
 4:  0.39 treatment    35         36    97             62  0.1729
 5:  0.26   control    47         43   110             63  0.4980
 6:  0.26 treatment    46         47    97             51  0.4980
 7:  0.13   control    66         60   110             44  0.1290
 8:  0.13 treatment    68         70    97             29  0.1290
 9:  0.00   control    86         78   110             24  0.0448
10:  0.00 treatment    86         89    97             11  0.0448


The next step is to add the significance levels to the plot. I have been experimenting with the code, but I do not understand it well enough to adapt it. I know I have to overwrite the plot data, but I cannot figure out how to do it.

counts %>% 
    ggplot(aes(x = ECOST, y = percentage, fill = group, label=sprintf("%.02f %%", round(percentage, digits = 1)))) + 
    geom_col(position = 'dodge') + 
    geom_text(position = position_dodge(width = .9),    # move to center of bars
              vjust = -0.5,    # nudge above top of bar
              size = 4) +           
    scale_fill_grey(start = 0.8, end = 0.5) +
    theme_bw(base_size = 15) +
    geom_signif(stat="identity",
              data=data.frame(x=c(0.875, 1.875), xend=c(1.125, 2.125),
                              y=c(5.8, 8.5), annotation=c("**", "NS")),
              aes(x=x,xend=xend, y=y, yend=y, annotation=annotation)) +
    geom_signif(comparisons=list(c("treatment", "control")), annotations="***",
              y_position = 9.3, tip_length = 0, vjust=0.4)

I would be very grateful if someone could explain to me how to add a bracket for control versus treatment, and perhaps one between two levels of ECOST. I am sure I could take it from there.

Desired result (taken from link):

enter image description here


Solution

  • I am not sure which significance levels you want to add to the graph, but this should help you understand how to add your own.

    counts %>% 
      ggplot(aes(x = ECOST, y = percentage, fill = group)) + 
      geom_col(position = 'dodge')  +           
      scale_fill_grey(start = 0.8, end = 0.5) +
      theme_bw(base_size = 15) +
      ## Between ECOST levels
      geom_signif(annotations = c("**", "NS"), y_position = c(95, 75),
                  xmin = c(1, 2), xmax = c(2, 5)) +
      ## Within ECOST levels, between groups
      geom_signif(annotations = c("***", "*"), y_position = c(50, 40),
                  xmin = c(3.75, 4.75), xmax = c(4.25, 5.25))
    

    y_position is the y value at which each bracket is drawn.

    xmin is a vector of the x positions where the brackets start. The ECOST levels sit at x = 1 to 5 in sorted order, so in the example the ** bracket starts at the first group of bars and the NS bracket starts at the second.

    xmax works the same way as xmin, but gives the positions where the brackets end.

    You also don't need two geom_signif layers. I only split them in two to show the difference between comparing ECOST levels and comparing groups within a level. A single layer does the same:

     + geom_signif(annotations = c("**", "NS", "***", "*"), y_position = c(95, 75, 50, 40),
                   xmin = c(1, 2, 3.75, 4.75), xmax = c(2, 5, 4.25, 5.25))
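
If you would rather not hard-code every bracket, the control-versus-treatment brackets could be derived from the data. The following is only a rough sketch under a few assumptions that are not in the question: the star thresholds (0.001, 0.01, 0.05) are just the usual convention, the ±0.225 offsets assume the default dodge width of 0.9 with two bars per level, and the names `brackets` and `x_center` as well as the vertical gap of 8 are made up for illustration. The data values are rounded from the table in the question.

```r
library(ggplot2)
library(ggsignif)

# Relevant columns of the question's data (values rounded for brevity)
counts <- data.frame(
  ECOST      = rep(c("0.52", "0.39", "0.26", "0.13", "0.00"), each = 2),
  group      = rep(c("control", "treatment"), times = 5),
  percentage = c(16.4, 32.0, 27.3, 36.1, 42.7, 47.4, 60.0, 70.1, 78.2, 88.7),
  p_value    = rep(c(0.0084, 0.1729, 0.4980, 0.1290, 0.0448), each = 2)
)

# One row per ECOST level: height of the taller bar and the level's p-value
# (the p-value is identical for both groups, so max() simply picks it out)
brackets <- aggregate(cbind(percentage, p_value) ~ ECOST, data = counts, FUN = max)

# A character x variable is placed at 1, 2, ... in sorted order
x_center <- as.integer(factor(brackets$ECOST))

# Assumed: default dodge width 0.9 and two groups, so bar centres are +/- 0.225
brackets$xmin <- x_center - 0.225
brackets$xmax <- x_center + 0.225

# Assumed gap of 8 percentage points above the taller bar
brackets$y <- brackets$percentage + 8

# Assumed conventional thresholds; adjust to whatever you want to report
brackets$label <- as.character(cut(
  brackets$p_value,
  breaks = c(0, 0.001, 0.01, 0.05, 1),
  labels = c("***", "**", "*", "NS"),
  include.lowest = TRUE
))

p <- ggplot(counts, aes(x = ECOST, y = percentage, fill = group)) +
  geom_col(position = "dodge") +
  geom_signif(annotations = brackets$label, y_position = brackets$y,
              xmin = brackets$xmin, xmax = brackets$xmax, tip_length = 0.01) +
  scale_fill_grey(start = 0.8, end = 0.5) +
  theme_bw(base_size = 15)

print(p)
```

A bracket between two ECOST levels can still be added by hand in the same way as above, using whole-number xmin and xmax values for the two levels.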