
Big change in computational time if I store the ggplot object before saving the plot

Please have a look at the reprex below. Even on a very seasoned laptop, it takes about one minute to get the job done and save a large colored scatterplot as a png file using ggplot2.

What puzzles me is that if I slightly change the last part of the code and write

gpl<-ggplot(df3_clip, aes(x, y)) + 
    ## geom_point(aes(color = dist), shape=46, alpha=.01) +
    geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt

ggsave(gpl,"pulse.png", dpi=300)

i.e. I store the ggplot2 plot as gpl and then explicitly save gpl, then the code takes forever to complete (I do not even know whether it ever finishes). Does anyone know why this happens? Since I usually write my code as gpl <- ggplot(...), I wonder whether I have been giving up performance all along.

Many thanks!


library(Rcpp)
library(ggplot2)
library(scattermore)
library(purrr)
library(dplyr)
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>     filter, lag
#> The following objects are masked from 'package:base':
#>     intersect, setdiff, setequal, union


opt = theme(legend.position  = "none",
            panel.background = element_rect(fill="black"),
            axis.ticks       = element_blank(),
            panel.grid       = element_blank(),
            axis.title       = element_blank(),
            axis.text        = element_blank())

## #bedhead
cppFunction('DataFrame createTrajectory(int n, double x0, double y0, 
            double a, double b) {
            // create the columns
            NumericVector x(n);
            NumericVector y(n);
            // start the trajectory at (x0, y0)
            x[0] = x0;
            y[0] = y0;
            for(int i = 1; i < n; ++i) {
            x[i] = sin(x[i-1]*y[i-1]/b)*y[i-1]+cos(a*x[i-1]-y[i-1]);
            y[i] = x[i-1]+sin(y[i-1])/b;
            }
            // return a new data frame
            return DataFrame::create(_["x"]= x, _["y"]= y);
            }
            ')


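## a and b (the attractor parameters) must be defined before this call; their values are not shown in this post
df3=createTrajectory(4000000, 1, 1, a, b)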

#something new
#color by dist from origin
eu_dist <- function(x1, y1, x2, y2) {
  sqrt((x1-x2)^2 + (y1-y2)^2)
}

df3$dist <- map2_dbl(df3$x, df3$y, ~eu_dist(.x, .y, 0, 0))

pulse_pal <- colorRampPalette(c("#FE1BE1", "#A300FF", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal2 <- colorRampPalette(c("#FE1BE1", "#FE1BE1", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal3 <- colorRampPalette(c("#FE1BE1", "#AA1BFE", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal4 <- colorRampPalette(c("#AA1BFE", "#57F7F5", "#57F7F5"))

#clip outer points
xmax <- max(df3$x)/2.5
xmin <- min(df3$x)/2.5
ymax <- max(df3$y)/2.5
ymin <- min(df3$y)/2.5

df3_clip <- df3 |> 
  filter(x > xmin & x < xmax) |> 
  filter(y > ymin & y < ymax)

print("df3clip ready")
#> [1] "df3clip ready"
ggplot(df3_clip, aes(x, y)) + 
    ## geom_point(aes(color = dist), shape=46, alpha=.01) +
    geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt

ggsave("pulse.png", dpi=300)
#> Saving 7 x 5 in image

#> 63.8 sec elapsed

  • You are calling ggsave with arguments by position rather than by name, and you have the plot in the first position. The first argument of ggsave is the file name (the plot comes second), so your plot object is being treated as the file name. The easiest way to make your code run is to be explicit:

    ggsave(plot=gpl,filename = "pulse.png", dpi=300)
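
    To see the argument order for yourself, here is a minimal sketch. It assumes a small stand-in plot built from the built-in mtcars data rather than your 4-million-point data frame, and a temporary output path; both are my own placeholders, not part of your code.

```r
library(ggplot2)

# The first two formal arguments of ggsave(): file name first, plot second
arg_names <- names(formals(ggsave))[1:2]
print(arg_names)

# Small stand-in plot (mtcars ships with R; this is not the data from the question)
gpl <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical output location in the session's temporary directory
out_file <- file.path(tempdir(), "pulse_demo.png")

# With named arguments the order in which they are written no longer matters
ggsave(plot = gpl, filename = out_file, width = 7, height = 5, dpi = 300)

# Confirm that the file was written
file.exists(out_file)
```

    Once the arguments are named, storing the plot in gpl first and saving it afterwards should behave the same as saving the last plot directly, so the habit of writing gpl <- ggplot(...) is not in itself what was slowing things down.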