Let's say I have a list of data frames and I want to plot them one by one using lapply and geom_tile, and ggsave each plot locally to my computer:
library(ggplot2)

df1 <- data.frame(grp = rep("A", 30),
                  point = 1:30,
                  y = rep(1, 30),
                  var = sample(c(TRUE, FALSE), 30, replace = TRUE))
df2 <- data.frame(grp = rep("B", 100),
                  point = 1:100,
                  y = rep(1, 100),
                  var = sample(c(TRUE, FALSE), 100, replace = TRUE))
df_list <- list(df1, df2)

# this should save "A.png" and "B.png" to the current directory
lapply(df_list, function(x) {
  p <- ggplot(x, aes(point, y, fill = var)) +
    geom_tile(color = "black") +
    theme_minimal()
  ggsave(filename = paste0(unique(x[["grp"]]), ".png"), plot = p, width = 8, height = 3)
})
Normally, ggsave(width = 8) works well for most of the data frames, but some of them have so many observations that the plot is too crowded at width = 8.
For example, df1 (A.png) is fine with width = 8.
However, df2 (B.png) might be too crowded at the same width.
For df2, I would like to save it with ggsave(width = 25).
Is there a way to set the width argument of ggsave automatically, based on the number of observations in the data frame being plotted, to obtain an optimally scaled plot (e.g. if the data frame has more than 100 data points, width should be greater than 25, and I'd like it to scale automatically)?
I know it's very hard to define "optimally scaled", but the point here is to find a way to use a different ggsave width in each lapply iteration.
Thanks in advance!
We can use an ifelse or case_when (from dplyr) statement to set the width based on nrow().
lapply(df_list, function(x) {
  p <- ggplot(x, aes(point, y, fill = var)) +
    geom_tile(color = "black") +
    theme_minimal()
  ggsave(filename = paste0(unique(x[["grp"]]), ".png"),
         plot = p,
         width = ifelse(nrow(x) > 80, 25, 4),
         height = 3)
})
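
If a single threshold is too coarse, the width could instead grow with the number of rows. The following is only a sketch under assumptions: scaled_width is a hypothetical helper (not part of ggplot2), and the 0.25 inches per observation, the minimum of 8 and the cap of 49 are illustrative values picked to match the widths mentioned in the question (8 for 30 rows, 25 for 100 rows). The cap is there because ggsave, by default, refuses very large dimensions unless limitsize = FALSE is passed.

```r
# Hypothetical helper: map a row count to a plot width in inches.
# per_obs, min_width and max_width are assumed values; tune them to taste.
scaled_width <- function(n, per_obs = 0.25, min_width = 8, max_width = 49) {
  # Scale linearly with n, then clamp to the range [min_width, max_width]
  min(max(n * per_obs, min_width), max_width)
}

scaled_width(30)    # 8  (30 * 0.25 = 7.5, raised to the minimum)
scaled_width(100)   # 25
scaled_width(1000)  # 49 (capped)
```

To use it, replace the width argument in the ggsave call with width = scaled_width(nrow(x)). Small data frames then keep the usual width of 8, while larger ones get proportionally wider images.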