r ggplot2 facet-wrap

Facet_wrap a histogram with different values for binwidth


I want to make a series of histograms of the same data, facet-wrapped so that each panel uses a different number of bins. Using the palmerpenguins::penguins dataset as an example:

library(palmerpenguins)
library(data.table)
library(ggplot2)
dt <- data.table(palmerpenguins::penguins)

ggplot(dt, aes(x = body_mass_g)) +
  geom_histogram(bins = 3)

ggplot(dt, aes(x = body_mass_g)) +
  geom_histogram(bins = 5)

which produce the expected plots:

Palmer penguin body mass histogram with 3 bins

Palmer penguin body mass histogram with 5 bins

What I want to do is facet wrap a series of these where the number of bins increments or is controlled by some variable. I think I've seen this done before with a single instance of ggplot() + geom_histogram() + facet_wrap() where the facet_wrap() has a parameter with some sequence, but I'm not sure if I'm misremembering.


Solution

  • We can call hist() inside an lapply, passing as breaks a sequence built with seq.int and a specific length.out (i bins need i + 1 break points). With plot=FALSE, hist() draws nothing and returns the statistics (e.g. breaks and counts), so lapply collects the results in a list. Avoid graphical arguments such as main, xlim, xlab, etc. at this stage; with plot=FALSE they are not used and throw a warning. We can set graphical parameters later when we plot.

    > breaks <- 2:10
    > p <- lapply(breaks, 
    +             \(i) with(dt, 
    +                       hist(body_mass_g, 
    +                            breaks=do.call('seq.int', 
    +                                           c(as.list(range(body_mass_g, na.rm=TRUE)), 
    +                                             list(length.out=i + 1L))),
    +                            plot=FALSE)
    +             )
    + )
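
    To see what each element of the list holds, here is a minimal sketch. It uses simulated values in place of body_mass_g (an assumption, so that it runs without palmerpenguins); x, n and h are illustrative names, not part of the code above.

    ```r
    # Simulated stand-in for dt$body_mass_g (assumption: any numeric vector works)
    set.seed(42)
    x <- rnorm(200, mean = 4200, sd = 800)

    n <- 5L  # desired number of bins
    h <- hist(x,
              breaks = seq.int(min(x), max(x), length.out = n + 1L),
              plot = FALSE)

    class(h)          # object of class "histogram"
    length(h$breaks)  # n + 1 break points
    length(h$counts)  # n bins
    sum(h$counts)     # each observation is counted in exactly one bin
    ```

    Because the breaks span the full range of x, every value is counted, which is a quick sanity check before plotting.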
    

    To plot the histograms, we simply call plot via Map, which lets us set the graphical parameters individually for each histogram.

    > par(mfrow=c(3, 3))
    > Map(plot, p, main=sprintf('%s bins', seq_along(p) + 1L)) |> invisible()
    


    Note that this does not store the plots in a list, only the statistics, so we can plot them later. For ideas on how to store base graphics in objects, consider ?recordPlot or these answers. However, I think having such a feature is overrated.
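
    As a rough sketch of the ?recordPlot route, the snippet below records a drawn histogram and replays it later. Simulated data again stands in for body_mass_g (an assumption), and pdf(NULL) is used only so the sketch runs without a screen device; on a file device like this the display list has to be switched on with dev.control() before recording.

    ```r
    set.seed(42)
    x <- rnorm(200, mean = 4200, sd = 800)  # stand-in for dt$body_mass_g (assumption)

    pdf(NULL)                               # off-screen device, writes no file
    dev.control(displaylist = "enable")     # let the device keep a display list to record
    hist(x,
         breaks = seq.int(min(x), max(x), length.out = 6L),
         main = "5 bins")
    rec <- recordPlot()                     # snapshot of what was drawn on the device
    invisible(dev.off())

    replayPlot(rec)                         # redraws the stored histogram on the current device
    ```

    On an interactive screen device the pdf(NULL) and dev.control() lines are usually unnecessary. Recorded plots are meant to be replayed within the same R session.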


    Data:

    > dt <- data.table::data.table(palmerpenguins::penguins)
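
    Since the question asks about facet_wrap(): one possible way to get there, sketched under assumptions, is to stack the hist(plot=FALSE) statistics into one data frame with a column for the number of bins and draw the bars with geom_rect(). This is not a built-in facet_wrap() feature; bin_counts, hist_df and p are illustrative names, and simulated data stands in for body_mass_g so that the sketch needs only ggplot2.

    ```r
    library(ggplot2)

    set.seed(42)
    x <- rnorm(300, mean = 4200, sd = 800)  # stand-in for dt$body_mass_g (assumption)

    bin_counts <- 2:10

    # One row per bar: which panel it belongs to, its edges, and its height
    hist_df <- do.call(rbind, lapply(bin_counts, function(n) {
      h <- hist(x,
                breaks = seq.int(min(x), max(x), length.out = n + 1L),
                plot = FALSE)
      data.frame(bins  = n,
                 xmin  = head(h$breaks, -1L),
                 xmax  = tail(h$breaks, -1L),
                 count = h$counts)
    }))

    # One panel per number of bins
    p <- ggplot(hist_df,
                aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count)) +
      geom_rect(fill = "grey60", colour = "white") +
      facet_wrap(~ bins, labeller = label_both) +
      labs(x = "body mass (g)", y = "count")

    print(p)
    ```

    With the penguin data, x would be the non-missing values of dt$body_mass_g, since min() and max() return NA otherwise.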