Tags: r, ggplot2, kernel-density, violin-plot

Separating groups in plot when using gradient fill


I'm using stat_ydensity (co-opted from the violin plot) to show the density within each bar through fill color rather than width.

This code does what I need, when using a single category:

library(tidyverse)
set.seed(100)
n <- 100
df <- data.frame(a=rnorm(n, mean=5, sd=1), b=rnorm(n, mean=10, sd=2), c=rnorm(n, mean=15, sd=3)) %>%
  pivot_longer(everything(), names_to = "group", values_to = "value") %>%
  mutate(group = factor(group),
         type = as.character(rbinom(seq(nrow(.)), 1, 0.5)))

ggplot() + 
  stat_ydensity(data = df,
                aes(x = group,
                    y = value,
                    fill = after_stat(density), 
                    # after_stat() replaces stat(), which is deprecated since ggplot2 3.4.0
                    xmin = after_stat(x) - 0.2, xmax = after_stat(x) + 0.2,
                    # Nudge y by a bit depending on the spread of your data
                    ymin = after_stat(y) - 0.05, ymax = after_stat(y) + 0.05), 
                geom = "rect", trim = TRUE)


However, I'd like to be able to compare distributions in two different categories. Normally, this would be done using aes(fill) or aes(color):

ggplot() + 
  stat_ydensity(data = df,
                aes(x = group,
                    y = value,
                    fill = type,
                    xmin = after_stat(x) - 0.2, xmax = after_stat(x) + 0.2,
                    ymin = after_stat(y) - 0.05, ymax = after_stat(y) + 0.05), 
                geom = "rect", trim = TRUE)

But this isn't compatible with using fill to show density. Is there any way to replicate the first plot while showing the two types in separate colors?


Solution

  • Here is a sketch of a possible solution using the package ggnewscale.

    
    library(tidyverse)
    library(ggnewscale)
    set.seed(100)
    n <- 200
    df <- data.frame(a=rnorm(n, mean=5, sd=1), b=rnorm(n, mean=10, sd=2), 
                     c=rnorm(n, mean=15, sd=3)) %>%
      pivot_longer(everything(), names_to = "group", values_to = "value") %>%
      mutate(group = factor(group),
             type = as.character(rbinom(seq(nrow(.)), 1, 0.5)))
    
    
    
    ggplot() + 
      # Layer 1: type 0, shifted left of each group position
      stat_ydensity(data = filter(df, type == "0"),
                    aes(x = as.numeric(group) - 0.2,
                        y = value,
                        group = interaction(group, type),
                        fill = after_stat(density), 
                        xmin = after_stat(x) - 0.2, xmax = after_stat(x) + 0.2,
                        # Nudge y by a bit depending on the spread of your data
                        ymin = after_stat(y) - 0.05, ymax = after_stat(y) + 0.05), 
                    inherit.aes = FALSE,
                    geom = "rect", trim = TRUE) +
      scale_fill_continuous(name = "type 0") +
      # Start a second, independent fill scale for the next layer
      new_scale_fill() +
      # Layer 2: type 1, shifted right of each group position
      stat_ydensity(data = filter(df, type == "1"),
                    aes(x = as.numeric(group) + 0.2,
                        y = value,
                        group = interaction(group, type),
                        fill = after_stat(density), 
                        xmin = after_stat(x) - 0.2, xmax = after_stat(x) + 0.2,
                        # Nudge y by a bit depending on the spread of your data
                        ymin = after_stat(y) - 0.05, ymax = after_stat(y) + 0.05), 
                    inherit.aes = FALSE,
                    geom = "rect", trim = TRUE) +
      scale_fill_gradient(low = 'blue', high = 'red', name = "type 1") +
      # x is numeric here, so restore the group names as axis labels
      scale_x_continuous(breaks = c(1, 2, 3), labels = c("a", "b", "c"), name = 'group')
    
    <sup>Created on 2025-06-15 with [reprex v2.1.1](https://reprex.tidyverse.org)</sup>
    
    

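The two layers in the answer differ only in their data subset and horizontal offset, so the repetition can be folded into a small helper. The following is a minimal sketch of that idea, not part of the original answer: `density_layer()`, its arguments, and the gradient colors are hypothetical names and values chosen for illustration, and the data frame is rebuilt in base R so the example runs without the tidyverse. The constants are injected into `aes()` with `!!` so the mappings do not depend on how function-local variables are looked up.

```r
library(ggplot2)
library(ggnewscale)

set.seed(100)
n <- 200
# Same shape of data as in the answer: three groups, a random binary type
df <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = n)),
  value = c(rnorm(n, 5, 1), rnorm(n, 10, 2), rnorm(n, 15, 3)),
  type  = as.character(rbinom(3 * n, 1, 0.5))
)

# Hypothetical helper: one density strip per group for a single type,
# shifted sideways by `offset`. `!!` inlines the numeric constants.
density_layer <- function(data, offset, half_width = 0.2, half_height = 0.05) {
  stat_ydensity(
    data = data,
    aes(x = as.numeric(group) + !!offset,
        y = value,
        group = group,
        fill = after_stat(density),
        xmin = after_stat(x) - !!half_width,
        xmax = after_stat(x) + !!half_width,
        ymin = after_stat(y) - !!half_height,
        ymax = after_stat(y) + !!half_height),
    inherit.aes = FALSE,
    geom = "rect", trim = TRUE
  )
}

p <- ggplot() +
  density_layer(subset(df, type == "0"), offset = -0.2) +
  scale_fill_gradient(low = "grey90", high = "darkblue", name = "type 0") +
  new_scale_fill() +  # second, independent fill scale
  density_layer(subset(df, type == "1"), offset = 0.2) +
  scale_fill_gradient(low = "grey90", high = "darkred", name = "type 1") +
  scale_x_continuous(breaks = 1:3, labels = levels(df$group), name = "group")

p
```

Giving each type its own hue (here a grey-to-blue and a grey-to-red gradient) keeps the two legends easy to tell apart. Note that each fill scale is trained on its own layer, so the same shade does not necessarily mean the same density in both types.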