I have a data set with two measurements whose values differ by orders of magnitude (let's call them height and width), measured in different samples (a and b). I want to illustrate the variability of these two measurements regardless of their absolute magnitude using a smoothed histogram, so I've been using geom_density. However, the density values calculated by geom_density are likewise orders of magnitude apart.
The dataset:
library(tidyverse)
set.seed(123)
sample_a = tibble(sample = "a",
                  height = rnorm(20, mean = 0.1, sd = 0.01),
                  width = rnorm(20, mean = 50, sd = 10)) %>%
  pivot_longer(c(height, width), names_to = "parameter", values_to = "result")
set.seed(321)
sample_b = tibble(sample = "b",
                  height = rnorm(20, mean = 0.2, sd = 0.03),
                  width = rnorm(20, mean = 55, sd = 10)) %>%
  pivot_longer(c(height, width), names_to = "parameter", values_to = "result")
data = bind_rows(sample_a, sample_b)
When I plot the histograms, I am able to compare the magnitude and variability for each parameter of interest in each sample when sample count is plotted on the y axis:
data %>%
  ggplot() +
  geom_histogram(aes(x = result)) +
  facet_grid(sample ~ parameter, scales = "free_x")
However, when I use geom_density (even incorporating y = after_stat(count), as suggested in this answer: Normalizing y-axis in histograms in R ggplot to proportion), the magnitudes are substantially different:
data %>%
  ggplot() +
  # after_stat() replaces the deprecated stat() helper
  geom_density(aes(x = result, y = after_stat(count))) +
  facet_grid(sample ~ parameter, scales = "free_x")
How would I show these different magnitudes of results in a faceted plot using smoothed histograms?
We can use after_stat(scaled), which rescales each density estimate to a maximum of 1:
ggplot(data, aes(x = result)) +
  geom_density(aes(y = after_stat(scaled))) +
  facet_grid(sample ~ parameter, scales = "free_x")
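To see why the raw density and count values differ so much between parameters while scaled does not, here is a rough base R sketch. It assumes that density() with its defaults is a reasonable stand-in for what geom_density computes in each panel; the variable names are only for illustration and the exact numbers in the plot may differ slightly.

```r
# Base R sketch (assumption: density() approximates the per-panel estimate
# that geom_density() draws; names below are illustrative only)
set.seed(123)
height <- rnorm(20, mean = 0.1, sd = 0.01)
width  <- rnorm(20, mean = 50, sd = 10)

d_height <- density(height)
d_width  <- density(width)

# Each curve integrates to 1, so the variable with the narrow range
# gets a much taller peak than the one with the wide range
peak_height <- max(d_height$y)
peak_width  <- max(d_width$y)
peak_height / peak_width

# count = density * n; with n = 20 in every panel the gap is unchanged
count_height <- d_height$y * length(height)
count_width  <- d_width$y * length(width)
max(count_height) / max(count_width)

# scaled = density / max(density); every curve now peaks at 1
scaled_height <- d_height$y / peak_height
scaled_width  <- d_width$y / peak_width
range(scaled_height)
range(scaled_width)
```

Because scaled divides each curve by its own maximum, the y axis no longer carries units of the measurement, so only the shape and spread of each curve are comparable across panels.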
To illustrate better, I showed the histogram and density plots on the same graph:
ggplot(data, aes(x = result)) +
  geom_histogram(aes(y = after_stat(count)), colour = "black", fill = NA) +
  geom_density(aes(y = after_stat(scaled))) +
  facet_grid(sample ~ parameter, scales = "free_x")
Or even better, use after_stat(ncount) to normalize the bin counts to a maximum of 1, so the histogram and the density plot share the same y scale:
ggplot(data, aes(x = result)) +
  geom_histogram(aes(y = after_stat(ncount)), colour = "black", fill = NA) +
  geom_density(aes(y = after_stat(scaled))) +
  facet_grid(sample ~ parameter, scales = "free_x")
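As a rough check of what ncount does, the same normalization can be sketched in base R. This assumes hist() with 30 equal-width bins is close enough to geom_histogram's default binning for illustration; ggplot2 places its bin edges differently, so individual bar heights will not match the plot exactly.

```r
# Base R sketch (assumption: hist() with 30 equal-width bins stands in for
# geom_histogram()'s default binning; bin edges will not match ggplot2's)
set.seed(123)
width <- rnorm(20, mean = 50, sd = 10)

# Pad the range slightly so every value falls inside a bin
breaks <- seq(min(width) - 1, max(width) + 1, length.out = 31)
h <- hist(width, breaks = breaks, plot = FALSE)

# ncount = count / max(count); the tallest bar becomes 1
ncount <- h$counts / max(h$counts)

# scaled = density / max(density); the curve also peaks at 1
d <- density(width)
scaled <- d$y / max(d$y)

max(ncount)
max(scaled)
```

Both layers then top out at 1, which is why the bars and the curve line up on the same axis in the plot above.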
Created on 2023-11-01 with reprex v2.0.2