targets-r-package

Implications of using callr_function = NULL in the targets package


I was wondering what happens when callr_function = NULL? Is it just issues with things maybe being in the environment/side effects?

Mainly wondering because I am passing quite large spatio-temporal arrays (0.5 to 5 GB) and callr serialization via saveRDS is quite slow.

The two options I was considering were forking callr and dropping in a different save function, or just using callr_function = NULL.


Solution

  • By default, targets runs the pipeline in a fresh, reproducible external R session. callr_function = NULL just tells it to run the pipeline in the current R session instead. I only recommend this for debugging, because in serious use cases you could accidentally invalidate some targets based on changed data in your global environment. callr_function = NULL will probably not help with large-memory issues. For that, I recommend selecting a more efficient storage format for your data, e.g. tar_target(..., format = "feather"). You could also try tar_option_set(memory = "transient", garbage_collection = TRUE) for better memory efficiency.
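
    As a rough illustration of the storage and memory settings mentioned above, a _targets.R file might look something like the sketch below. The target names (big_array, array_mean) and the simulated array are hypothetical stand-ins for the real data. It assumes format = "qs" rather than "feather", since the feather format is meant for data frames and the question is about arrays; "qs" needs the qs package (or qs2 in newer targets releases) installed. garbage_collection is set on the individual target here because its accepted values in tar_option_set() appear to differ between targets versions.

    ```r
    # _targets.R (a minimal sketch; target names and data are hypothetical)
    library(targets)

    # Unload each target from memory once downstream targets no longer need it.
    tar_option_set(memory = "transient")

    list(
      tar_target(
        big_array,
        # Stand-in for a large spatio-temporal array.
        array(rnorm(1e6), dim = c(100, 100, 100)),
        format = "qs",              # assumption: faster serialization than the default "rds"
        garbage_collection = TRUE   # run gc() before this target builds
      ),
      tar_target(array_mean, mean(big_array))
    )
    ```

    With this file in place, tar_make() runs the pipeline in an external R session as usual, while tar_make(callr_function = NULL) runs it in the current session, which is mainly useful for debugging.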