r ggplot2 ggsave

Big change in computational time if I store the ggplot object before saving the plot


Please have a look at the reprex below. Even on a very seasoned laptop, it takes about one minute to get the job done and save a large colored scatterplot as a png file using ggplot2.

What puzzles me is that if I slightly change the last part of the code and write

#plot
gpl<-ggplot(df3_clip, aes(x, y)) + 
    ## geom_point(aes(color = dist), shape=46, alpha=.01) +
    geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
    
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt

ggsave(gpl,"pulse.png", dpi=300
       )


That is, I store the ggplot2 plot as gpl and then explicitly save gpl. With this change the code takes forever to complete (I do not even know whether it ever finishes). Does anyone know why this happens? Since I usually write my code as gpl <- ggplot(...), I wonder whether I have been giving up performance all along.

Many thanks!

rm(list=ls())

library(Rcpp)
library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(purrr)
library(scattermore)
library(tictoc)

tic()

opt = theme(legend.position  = "none",
            panel.background = element_rect(fill="black"),
            axis.ticks       = element_blank(),
            panel.grid       = element_blank(),
            axis.title       = element_blank(),
            axis.text        = element_blank())



## #bedhead
cppFunction('DataFrame createTrajectory(int n, double x0, double y0, 
            double a, double b) {
            // create the columns
            NumericVector x(n);
            NumericVector y(n);
            x[0]=x0;
            y[0]=y0;
            for(int i = 1; i < n; ++i) {
            x[i] = sin(x[i-1]*y[i-1]/b)*y[i-1]+cos(a*x[i-1]-y[i-1]);
            y[i] = x[i-1]+sin(y[i-1])/b;
            }
            // return a new data frame
            return DataFrame::create(_["x"]= x, _["y"]= y);
            }
            ')

a=1
b=0.75

df3=createTrajectory(4000000, 1, 1, a, b)

#something new
#color by dist from origin
eu_dist <- function(x1, y1, x2, y2) {
  sqrt((x1-x2)^2 + (y1-y2)^2)
}


df3$dist <- map2_dbl(df3$x, df3$y, ~eu_dist(.x, .y, 0, 0))

pulse_pal <- colorRampPalette(c("#FE1BE1", "#A300FF", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal2 <- colorRampPalette(c("#FE1BE1", "#FE1BE1", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal3 <- colorRampPalette(c("#FE1BE1", "#AA1BFE", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal4 <- colorRampPalette(c("#AA1BFE", "#57F7F5", "#57F7F5"))

#clip outer points
xmax <- max(df3$x)/2.5
xmin <- min(df3$x)/2.5
ymax <- max(df3$y)/2.5
ymin <- min(df3$y)/2.5

df3_clip <- df3 |> 
  filter(x > xmin & x < xmax) |> 
  filter(y > ymin & y < ymax)

print("df3clip ready")
#> [1] "df3clip ready"
    
#plot
ggplot(df3_clip, aes(x, y)) + 
    ## geom_point(aes(color = dist), shape=46, alpha=.01) +
    geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
    
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt


ggsave("pulse.png", dpi=300
       )
#> Saving 7 x 5 in image

toc()
#> 63.8 sec elapsed

sessionInfo()
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.13.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
#>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
#>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] tictoc_1.1      scattermore_0.8 purrr_1.0.1     dplyr_1.1.0    
#> [5] ggplot2_3.4.0   Rcpp_1.0.9     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.2.2    pillar_1.8.1      highr_0.9         R.methodsS3_1.8.2
#>  [5] R.utils_2.12.1    tools_4.2.2       digest_0.6.30     evaluate_0.17    
#>  [9] lifecycle_1.0.3   tibble_3.1.8      gtable_0.3.1      R.cache_0.16.0   
#> [13] pkgconfig_2.0.3   rlang_1.0.6       reprex_2.0.2      cli_3.6.0        
#> [17] yaml_2.3.6        xfun_0.34         fastmap_1.1.0     withr_2.5.0      
#> [21] styler_1.8.0      stringr_1.5.0     knitr_1.40        systemfonts_1.0.4
#> [25] generics_0.1.3    fs_1.5.2          vctrs_0.5.2       tidyselect_1.2.0 
#> [29] grid_4.2.2        glue_1.6.2        R6_2.5.1          textshaping_0.3.6
#> [33] fansi_1.0.4       rmarkdown_2.17    farver_2.1.1      magrittr_2.0.3   
#> [37] scales_1.2.1      htmltools_0.5.3   colorspace_2.0-3  ragg_1.2.4       
#> [41] labeling_0.4.2    utf8_1.2.3        stringi_1.7.12    munsell_0.5.0    
#> [45] R.oo_1.25.0

Created on 2023-03-03 with reprex v2.0.2


Solution

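  • You are passing the arguments to ggsave by position rather than by name, and you have the plot in the first position. The first argument of ggsave is filename and the second is plot, so ggsave(gpl, "pulse.png", dpi=300) treats gpl as the file name and "pulse.png" as the plot. Storing the plot in gpl is not the problem in itself. The easiest way to make your code run is to name the arguments explicitly:

    ggsave(plot = gpl, filename = "pulse.png", dpi = 300)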
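
To check the argument order yourself, here is a minimal sketch. It uses a small stand-in plot built from the built-in mtcars data rather than the 4-million-point trajectory, and writes to a hypothetical file name in a temporary directory; both are assumptions made only to keep the example quick to run.

```r
library(ggplot2)

# The first two formal arguments of ggsave() are filename and plot, in that order.
first_two <- names(formals(ggsave))[1:2]
print(first_two)

# Stand-in plot (assumption: mtcars instead of df3_clip, to keep this fast).
gpl <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical output path in a temporary directory.
out <- file.path(tempdir(), "pulse_demo.png")

# Option 1: name the arguments, so their order no longer matters.
ggsave(plot = gpl, filename = out, dpi = 300, width = 7, height = 5)

# Option 2: stay positional, but in the order ggsave() expects.
ggsave(out, gpl, dpi = 300, width = 7, height = 5)

# Confirm the file was written.
saved <- file.exists(out)
print(saved)
```

Either call saves the stored plot, so keeping the gpl <- ggplot(...) habit should be fine as long as the arguments reach ggsave in the right slots.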