r ggplot2 annotations

Add the percentage of different types of observations in a geom_count plot in ggplot2


A minimal working example (MWE) would be:

library(tibble)
library(ggplot2)

df <- tibble(ID = 1:12,
             Var1 = c("A", "B", "C", "C", "C", "A", "B", "B", "C", "A", "B", "C"),
             Var2 = c("B", "B", "A", "A", "C", "C", "A", "A", "B", "C", "A", "C")
)

# Draw a plot in which dot size shows the frequency of each
# Var1/Var2 combination.
ggplot(df, aes(x = factor(Var1, levels = c("A", "B", "C")),
               y = factor(Var2, levels = c("A", "B", "C")))) +
   geom_count(show.legend = FALSE) +
   labs(x = "Var1", y = "Var2") +
   theme_bw()

The resulting figure looks like this: (image: resulting plot)

What I want to do is add a number beside each dot in the figure showing the percentage of observations that the dot represents.

I can calculate the percentages and manually add an annotation for each dot. However, I have five treatments like this, and each treatment has 9 percentages to add to the plot. Adding them one by one manually would be very tedious. Is there an easier way? I believe there is one, since R is very powerful!

Thanks a lot for all your help.


Solution

  • Using geom_text with stat = "sum" and after_stat(), you can add a label showing the proportions, as in the code below. Here n is the name of the column containing the counts that stat = "sum" computes under the hood:

    set.seed(123)
    
    df <- data.frame(
      ID = c(1:12),
      Var1 = sample(LETTERS[1:3], 12, replace = TRUE),
      Var2 = sample(LETTERS[1:3], 12, replace = TRUE)
    )
    
    library(ggplot2)
    
    ggplot(df, aes(
      x = factor(Var1, levels = c("A", "B", "C")),
      y = factor(Var2, levels = c("A", "B", "C"))
    )) +
      geom_count(show.legend = FALSE) +
      geom_text(
        aes(
          label = scales::percent(after_stat(n / sum(n)))
        ),
        size = 10 / .pt,
        stat = "sum",
        vjust = 0,
        nudge_y = .2
      ) +
      labs(x = "Var1", y = "Var2") +
      theme_bw()
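
    The question mentions five treatments. One way to handle them, sketched here as an alternative to after_stat(), is to compute the counts and within-treatment shares up front and draw one panel per treatment. The data below are simulated, and the `Treatment` column name, the `counts` data frame and its `pct` column are assumptions made for illustration; they do not come from the question.

    ```r
    library(ggplot2)

    set.seed(123)

    # Hypothetical data: `Treatment` is an assumed column name
    df <- data.frame(
      Treatment = rep(paste0("T", 1:5), each = 12),
      Var1 = sample(LETTERS[1:3], 60, replace = TRUE),
      Var2 = sample(LETTERS[1:3], 60, replace = TRUE)
    )

    # Count every Treatment/Var1/Var2 combination and drop empty cells
    counts <- as.data.frame(
      table(Treatment = df$Treatment, Var1 = df$Var1, Var2 = df$Var2),
      responseName = "n"
    )
    counts <- counts[counts$n > 0, ]

    # Share of each combination within its own treatment
    counts$pct <- counts$n / ave(counts$n, counts$Treatment, FUN = sum)

    p <- ggplot(counts, aes(x = Var1, y = Var2)) +
      geom_point(aes(size = n), show.legend = FALSE) +
      geom_text(
        aes(label = scales::percent(pct, accuracy = 1)),
        size = 10 / .pt,
        vjust = 0,
        nudge_y = .2
      ) +
      facet_wrap(vars(Treatment)) +
      labs(x = "Var1", y = "Var2") +
      theme_bw()

    p
    ```

    Precomputing the shares makes the denominator explicit: each percentage is relative to its own treatment, so the labels within every panel add up to roughly 100% (up to rounding).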