Showing percentages of strata in alluvial plots


I have data from an intervention study with four experimental groups (Group) and three measurement points (Time). I conducted a profile and transition analysis, which yielded four profiles (Profile). I made an alluvial diagram showing the transitions between profiles across measurement points for each experimental group.

library(ggplot2)
library(ggalluvial)
library(dplyr)
library(tidyr)

df <- tibble(id = 1:400, group = sample(1:4, size = 400, replace = TRUE)) |>
  crossing(time = 1:3) |>
  mutate(profile = sample(1:4, size = n(), replace = TRUE)) |>
  mutate(profile = as.character(profile), time = as.character(time))


ggplot(df, aes(x = time, stratum = profile, alluvium = id, fill = profile, label = profile)) +
    scale_fill_brewer(type = "seq", palette = "BrBG") +
    geom_flow() +
    geom_stratum() +
    labs(x = "Time", y = "N") +
    facet_wrap('group', labeller = label_both)

Because the four groups do not have the same sample size, I would like to plot relative stratum sizes instead of absolute ones, so that the profiles at each time point sum to 100% within each group. I don't know how to achieve this. Using the weight attribute didn't work.


Solution

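  • One approach is to compute a weight manually, map it directly to y, and use scale_y_continuous(labels = scales::percent_format()) so the axis shows percentages. Each row gets the weight 1/n, where n is the number of rows for its group and time point, so the strata at every time point stack to 100% in each facet.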

    library(ggplot2)
    library(ggalluvial)
    library(dplyr)
    library(tidyr)
    
    set.seed(1) # set seed because we sample
    
    df <- tibble(id = 1:400, group = sample(1:4, size = 400, replace = TRUE)) |>
      crossing(time = 1:3) |>
      mutate(profile = sample(1:4, size = n(), replace = TRUE))|>
      mutate(profile = as.character(profile), time = as.character(time))|>
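      # count the rows per group and time point (.by needs dplyr >= 1.1.0),
      # then give every row an equal share of that total
      mutate(total = n(), .by = c(group, time)) |>
      mutate(weight = 1 / total)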
    
    ggplot(df,
           aes(x = time,
               y = weight, # use weight here
               stratum = profile, 
               alluvium = id,
               fill = profile, 
               label = profile)) + 
      scale_fill_brewer(type = "seq", palette = "BrBG") +
      geom_flow(alpha = 0.6) +
      geom_stratum() +
      scale_y_continuous(labels = scales::percent_format()) +
      labs(x = "Time", y = "Percentage") +
      facet_wrap('group', labeller = label_both)
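
To confirm that the weights behave as intended, the per-profile shares can be summed and checked against 1 for every group and time point. The following is a minimal sketch that rebuilds the same simulated data as above; the names `shares` and `totals` are only illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)

set.seed(1) # same simulated data as in the answer

df <- tibble(id = 1:400, group = sample(1:4, size = 400, replace = TRUE)) |>
  crossing(time = 1:3) |>
  mutate(profile = as.character(sample(1:4, size = n(), replace = TRUE))) |>
  # each row contributes 1 / (number of rows in its group and time point)
  mutate(weight = 1 / n(), .by = c(group, time))

# relative size of each profile (stratum) per group and time point
shares <- df |>
  summarise(share = sum(weight), .by = c(group, time, profile))

# the shares should add up to 1 (100%) in every group/time combination
totals <- shares |>
  summarise(total = sum(share), .by = c(group, time))

print(totals)
```

If any `total` differs noticeably from 1, the weights were probably computed over the wrong grouping, for example per group only rather than per group and time point.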