Similarly to this question, I want to generate several separate bar plots that always have the same bar width. I also want the spacing between the bars to be the same in each of these plots.
I also need these plots to be saved individually, with each saved image's size proportional to the number of bars. I do not want to use facets or grids (which means similar questions on SO do not apply to me). Furthermore, I need a geom_bar() solution together with ggsave().
REPRODUCIBLE EXAMPLE
Libraries
library(tidyverse)
library(ggplot2)
Mock data (in reality, I have hundreds of variables and up to 15 levels for some factors)
df <- structure(list(Animal = structure(c(1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 3L, 3L), levels = c("Cat", "Dog", "Horse"), class = "factor"),
Sex = structure(c(2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L), levels = c("Female",
"Male"), class = "factor")), row.names = c(NA, 10L), class = "data.frame")
Function to generate bar plots
# Create counter
counter <- 0
# Function
make_bar_plots <- function(df){
# We loop over the columns
for (col in colnames(df)){
# (1) Preparation ##########################################################
# Store N
N <- sum(!is.na(df[col]))
# Store number of levels
n_levels <- length(unique(df[!is.na(df[col]), col]))[1]
# Update counter at each iteration
counter <- counter + 1
# (2) Basic bar plot #######################################################
# We only plot factors
if (is.factor(df[[col]])) {
p <- ggplot(data = df[col] %>% filter(!is.na(df[col])),
aes(x=fct_rev(fct_infreq(!!sym(col)))))
# Compute percents for bars
p <- p + geom_bar(aes(y = (..count..)/sum(..count..)),
width = 0.35*n_levels/3, # We make column width proportional to number of levels!
fill = 'black',
size = 1.5
)
# (3) Annotations ########################################################
# Add percent as annotation
p <- p + geom_text(aes(label = scales::percent((..count..)/sum(..count..),
accuracy = 0.1),
y = ((..count..)/sum(..count..))),
stat="count", hjust = -0.1, size=4.2)
# Add a title
p <- p + labs(title = paste(col),
subtitle = paste0('N = ', format(N, big.mark = ',')
))
# (4) Aesthetics #########################################################
p <- p + theme_minimal()
p <- p + theme(axis.title.y = element_blank(),
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_text(colour='black', size = 12.5),
plot.margin = margin(r=25, l=1, b=0, t=1),
plot.title = element_text(size = 16.5),
plot.subtitle = element_text(size = 12.5),
panel.grid = element_blank()
)
# Flip coordinates
p <- p + coord_flip(clip = 'off')
# (5) Print and export #################################################
print(p)
# Export
ggsave(paste0(counter, '_', col, '.png'), path = 'plots/',
bg = 'white'
)
}
}
}
Output
make_bar_plots(df)
Discussion
We see that bar width is the same in both plots. This was achieved with the argument width = 0.35*n_levels/3 that is included in the function.
That said, the spacing between bars is different for Cat, Dog, and Horse (Plot 1), compared to Male and Female (Plot 2).
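The difference in spacing can be checked with a little arithmetic. The sketch below is only an approximation: bar_geometry() and the 6-inch panel height are hypothetical, and it assumes ggplot2's default discrete-axis expansion of 0.6 units on each side. When the panel size is the same for every plot, scaling width with n_levels keeps the bars roughly equally thick, but the leftover gap per category shrinks as levels are added.

```r
# Approximate physical size (in inches) of one bar and of the gap between
# adjacent bars on a discrete axis with n_levels categories.
# Assumption: default discrete expansion of 0.6 units per side, so the
# axis spans (n_levels - 1) + 2 * 0.6 data units.
bar_geometry <- function(n_levels, width, panel_in) {
  unit_in <- panel_in / (n_levels - 1 + 2 * 0.6)
  c(bar = width * unit_in, gap = (1 - width) * unit_in)
}

panel_in <- 6  # hypothetical panel height, identical for every plot

# Width rule from the question: 0.35 * n_levels / 3
sizes <- sapply(c(Sex = 2, Animal = 3), function(n) {
  bar_geometry(n, width = 0.35 * n / 3, panel_in = panel_in)
})
print(round(sizes, 2))
```

The "bar" row is nearly identical for both columns, while the "gap" row is clearly larger for the two-level factor.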
What I have tried:
Adding a height argument to the ggsave() call, to adjust the height of the generated image using a proportion (e.g., height = 2*n_levels/3). That said, the resulting plots have different bar widths, see illustration below.
Setting a fixed aspect ratio (e.g., p + theme(aspect.ratio = 1/4)). This too makes the width of the bars different across plots.
Illustration when using height = 2*n_levels/3
My 2 questions:
I get decent results by setting width = 0.4 (pick your favorite value) in geom_bar and adjusting the output height to reflect the header height (here "1") plus a fixed amount (here "0.5") per level. Tweak to taste, depending on the actual ratio of header height to category height.
ggsave(paste0(counter, '_', col, '.png'), path = 'plots/',
height = 1 + 0.5*n_levels,
bg = 'white'
)
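As a minimal sketch of the same idea, the height rule can be pulled into a small helper so the header allowance and the per-level amount are tuned in one place. plot_height() is a hypothetical name, and the defaults simply reuse the values above; the 15-level case mirrors the largest factor mentioned in the question.

```r
# Hypothetical helper: image height in inches as a fixed header allowance
# plus a constant amount per factor level.
plot_height <- function(n_levels, header = 1, per_level = 0.5) {
  header + per_level * n_levels
}

heights <- sapply(c(Sex = 2, Animal = 3, Max = 15), plot_height)
print(heights)

# Inside the loop, the export step could then read (filename is a
# placeholder for whatever name is built for the current column):
# ggsave(filename, path = 'plots/',
#        height = plot_height(n_levels), bg = 'white')
```

Because each extra level adds the same number of inches, a constant width in geom_bar gives bars and gaps of about the same physical size in every image. The match is approximate, since the axis padding does not grow with the number of levels.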