I want to make a series of histograms from the same data, facet-wrapped so that each panel uses a different number of bins. Using the palmerpenguins::penguins dataset as an example:
library(palmerpenguins)
library(data.table)
library(ggplot2)
dt <- data.table(palmerpenguins::penguins)
ggplot(dt, aes(x = body_mass_g)) +
  geom_histogram(bins = 3)
ggplot(dt, aes(x = body_mass_g)) +
  geom_histogram(bins = 5)
which produce the expected plots:
What I want to do is facet wrap a series of these where the number of bins increments or is controlled by some variable. I think I've seen this done before with a single instance of ggplot() + geom_histogram() + facet_wrap(), where the facet_wrap() has a parameter with some sequence, but I'm not sure if I'm misremembering.
We can use hist() in an lapply, passing sequences with a specific length.out as breaks. Setting plot=FALSE computes only the statistics (i.e. breaks and counts) without drawing anything, and lapply stores the results in a list. Avoid passing graphical arguments such as main, xlim, xlab, etc. at this stage, since they would throw a warning. We can set graphical parameters later when we plot it.
> breaks <- 2:10
> p <- lapply(breaks,
+             \(i) with(dt,
+                       hist(body_mass_g,
+                            breaks=do.call('seq.int',
+                                           c(as.list(range(body_mass_g, na.rm=TRUE)),
+                                             list(length.out=i + 1L))),
+                            plot=FALSE)
+             )
+ )
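To see what each element of such a list holds, here is a minimal sketch on a small toy vector (the vector x and the names n_bins, brk and h are made up for illustration and are not part of the penguins example). It uses the same idea: n bins need n + 1 equally spaced break points spanning the range of the data.

```r
# Toy data, hypothetical values chosen only for illustration
x <- c(1, 2, 2, 3, 5, 8, 9, 10)

# n_bins bins require n_bins + 1 break points across the data range
n_bins <- 3L
brk <- seq.int(min(x), max(x), length.out = n_bins + 1L)

# plot = FALSE returns the computed statistics without drawing anything
h <- hist(x, breaks = brk, plot = FALSE)

class(h)   # an object of class "histogram"
h$breaks   # the break points that were used
h$counts   # number of observations falling into each bin
```

Because the result is a plain "histogram" object, it can be passed to plot() at any later point, which is what makes storing the statistics in a list useful.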
To plot the histograms, we simply use plot in Map, where we can set all the graphical stuff individually for each histogram.
> par(mfrow=c(3, 3))
> Map(plot, p, main=sprintf('%s bins', seq_along(p) + 1L)) |> invisible()
Note that this does not store the plots in a list, only the statistics, so we can plot them later. For ideas on how to store base graphics in objects, consider ?recordPlot or these answers. However, I think having such a feature is overrated.
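If you would rather stay in ggplot2, as the question asks, one possible sketch is below. As far as I know, facet_wrap() has no parameter that varies the bin count, so this is a workaround under my own assumptions and not a built-in feature: replicate the data once per bin count, add one geom_histogram() layer per bin count that sees only its own copy, and facet on the bin count. The names bins, long, n_bins and g are made up for illustration.

```r
library(palmerpenguins)
library(ggplot2)

# Bin counts to compare (hypothetical choice)
bins <- c(3, 5, 10)

# Drop missing values so stat_bin does not warn about removed rows
x <- penguins$body_mass_g
x <- x[!is.na(x)]

# One copy of the data per bin count, labelled by that bin count
long <- data.frame(
  body_mass_g = rep(x, times = length(bins)),
  n_bins      = rep(bins, each = length(x))
)

# One histogram layer per bin count, each restricted to its own copy;
# adding a list of layers to a ggplot object adds each layer in turn
g <- ggplot(long, aes(x = body_mass_g)) +
  lapply(bins, \(b) geom_histogram(data = subset(long, n_bins == b), bins = b)) +
  facet_wrap(~ n_bins, labeller = label_both)

print(g)
```

Each layer only contains rows for a single value of n_bins, so it draws in exactly one panel. The cost is that the data are duplicated once per panel, which is fine for a dataset of this size.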
Data:
> dt <- data.table::data.table(palmerpenguins::penguins)