r ggplot2 tidyverse bar-chart

Bar plots with identical bar width *and* identical bar spacing across *several* plots


Similarly to this question, I want to generate several separate bar plots that always have the same bar width. In addition, I want the spacing between the bars to be the same in every plot.

I also need these plots to be saved individually, with each saved image's size proportional to its number of bars. I do not want to use facets or grids (which means that similar questions on SO do not apply to me). Furthermore, I need a geom_bar() solution together with ggsave().

REPRODUCIBLE EXAMPLE

Libraries

library(tidyverse)
library(ggplot2)

Mock data (in reality, I have hundreds of variables and up to 15 levels for some factors)

df <- structure(list(Animal = structure(c(1L, 2L, 2L, 2L, 1L, 1L, 1L, 
                                          1L, 3L, 3L), levels = c("Cat", "Dog", "Horse"), class = "factor"), 
                     Sex = structure(c(2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L), levels = c("Female", 
                                                                                           "Male"), class = "factor")), row.names = c(NA, 10L), class = "data.frame")

Function to generate bar plots

# Create counter
counter <- 0

# Function
make_bar_plots <- function(df){ 
  
  # We loop over the columns
  for (col in colnames(df)){ 
    
    # (1) Preparation ##########################################################
    # Store N
    N <- sum(!is.na(df[col]))
    # Store number of levels
    n_levels <- length(unique(df[!is.na(df[col]), col]))[1]
    # Update counter at each iteration
    counter <- counter + 1
    
    # (2) Basic bar plot #######################################################
    # We only plot factors
    if (is.factor(df[[col]])) {
      p <- ggplot(data = df[col] %>% filter(!is.na(df[col])),
                  aes(x=fct_rev(fct_infreq(!!sym(col)))))
      # Compute percents for bars
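      p <- p + geom_bar(aes(y = after_stat(count / sum(count))), # after_stat() replaces the deprecated ..count..
                        width = 0.35*n_levels/3, # We make column width proportional to number of levels!
                        fill = 'black',
                        linewidth = 1.5 # 'size' for lines is deprecated since ggplot2 3.4.0
      ) 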
      
      # (3) Annotations ########################################################
      
      # Add percent as annotation
      p <- p + geom_text(aes(label = after_stat(scales::percent(count / sum(count),
                                                                accuracy = 0.1)),
                             y = after_stat(count / sum(count))),
                         stat="count", hjust = -0.1, size=4.2)
      # Add a title
      p <- p + labs(title = paste(col),
                    subtitle = paste0('N = ', format(N, big.mark = ',')
                    ))
      
      # (4) Aesthetics #########################################################
      p <- p + theme_minimal()
      p <- p + theme(axis.title.y = element_blank(),
                     axis.title.x = element_blank(),
                     axis.text.x = element_blank(),
                     axis.text.y = element_text(colour='black', size = 12.5),
                     plot.margin = margin(r=25, l=1, b=0, t=1),
                     plot.title = element_text(size = 16.5),
                     plot.subtitle = element_text(size = 12.5),
                     panel.grid = element_blank()
                     )
      # Flip coordinates
      p <- p + coord_flip(clip = 'off')
      
      # (5) Print and export   #################################################
      print(p)
      # Export
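      # paste0() avoids the spaces that paste() would insert into the file name;
      # the 'plots/' directory should exist beforehand
      ggsave(paste0(counter, '_', col, '.png'), plot = p, path = 'plots/', 
             bg = 'white'
      )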
    } 
  }
}

Output

make_bar_plots(df)

Plot 1

Plot 2

Discussion

We see that bar width is the same in both plots. This was achieved with the argument width = 0.35*n_levels/3 that is included in the function.

That said, the spacing between bars is different for Cat, Dog, and Horse (Plot 1), compared to Male and Female (Plot 2).
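
To make the difference concrete, here is a rough back-of-the-envelope sketch rather than anything ggplot2 reports itself. It assumes that each factor level occupies one unit on the discrete axis, that the default discrete scale expansion pads 0.6 units at each end, and that both panels share the same fixed height. The helper name bar_geometry and the 5-inch panel height are made up for illustration.

```r
# Approximate physical size (in inches) of one bar and of the gap between
# neighbouring bars when the panel height does not depend on the number of levels.
bar_geometry <- function(n_levels, width, panel_height, expand_add = 0.6) {
  # Each level takes one data unit; the discrete scale pads both ends.
  unit_size <- panel_height / (n_levels + 2 * expand_add)
  c(bar = width * unit_size, gap = (1 - width) * unit_size)
}

panel_height <- 5  # hypothetical panel height in inches, same for both plots

animal <- bar_geometry(3, width = 0.35 * 3 / 3, panel_height)  # Cat, Dog, Horse
sex    <- bar_geometry(2, width = 0.35 * 2 / 3, panel_height)  # Female, Male

print(round(rbind(animal, sex), 3))
```

Under these assumptions the bar thickness comes out similar in both plots, but the gap is clearly larger in the two-level plot, because the same panel height is shared among fewer levels.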

What I have tried:

  1. Adjusting the height of the generated image in the ggsave() call using a proportion (e.g., height = 2*n_levels/3). However, the resulting plots have different bar widths; see the illustration below.
  2. Adjusting the aspect ratio in the aesthetics part of the function (e.g., p + theme(aspect.ratio = 1/4)). This also makes the bar width differ across plots.
  3. A combination of the above, with the same undesired results.

Illustration when using height = 2*n_levels/3

Plot 1

Plot 2

My 2 questions:

  1. How do we obtain the same spacing between bars across all plots, while also keeping bar width identical across all plots?
  2. Furthermore, how do we save an image whose height will be automatically adjusted to the number of bars? In my example (first two plots of this post), the height of the Sex plot (2 bars) would be smaller than that of the Animal plot (3 bars). Currently, both have the same height.

Solution

  • I get decent results by using a constant width = 0.4 (pick your favorite value) in geom_bar() and setting the ggsave() height to a fixed amount for the header (here 1) plus a fixed amount per level (here 0.5); ggsave() measures height in inches by default. Tweak both numbers to taste, depending on the actual ratio of header height to per-category height in your plots.

    # paste0() avoids the spaces that paste() would insert into the file name
    ggsave(paste0(counter, '_', col, '.png'), path = 'plots/', 
                 height = 1 + 0.5*n_levels,
                 bg = 'white'
          )
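
For reference, the same idea can be wrapped into a small standalone function. This is only a minimal sketch under my own assumptions, not a drop-in replacement for the function in the question: the names plot_height and save_bar_plot, the fixed image width of 6 inches, and the stripped-down styling are all hypothetical. It calls ggplot2 functions with the ggplot2:: prefix, so the package must be installed.

```r
# Hypothetical helper: image height (inches) = fixed header + fixed amount per bar.
plot_height <- function(n_levels, header = 1, per_level = 0.5) {
  header + per_level * n_levels
}

# Hypothetical wrapper: draw one factor column and save it with a height
# that grows with its number of levels.
save_bar_plot <- function(data, col, index, out_dir = "plots", img_width = 6) {
  # Drop missing values and unused levels so nlevels() counts only drawn bars
  x <- droplevels(data[[col]][!is.na(data[[col]])])
  p <- ggplot2::ggplot(data.frame(x = x), ggplot2::aes(x = x)) +
    ggplot2::geom_bar(width = 0.4, fill = "black") +  # constant width in data units
    ggplot2::coord_flip() +
    ggplot2::labs(title = col) +
    ggplot2::theme_minimal()
  # Create the output directory if it does not exist yet
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  ggplot2::ggsave(
    filename = paste0(index, "_", col, ".png"),
    plot = p, path = out_dir,
    width = img_width, height = plot_height(nlevels(x)),
    bg = "white"
  )
}
```

With the mock data from the question, save_bar_plot(df, "Animal", 1) and save_bar_plot(df, "Sex", 2) would write a taller image for Animal (3 bars) than for Sex (2 bars). The header and per_level defaults are the values from this answer and will likely need adjusting for other title sizes or margins.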
    
