
How to write models to tempdir()


I'm creating an R package that automatically creates trained models. One of the goals is to allow the user to save the trained models for use with future data. The trained models were originally saved to the Environment. Comments from CRAN state, "In your examples/vignettes/tests you can write to tempdir()."

Also from CRAN, "Omit any default path in your functions and write to tempdir() in examples, vignettes or tests." https://contributor.r-project.org/cran-cookbook/code_issues.html#writing-files-and-directories-to-the-home-filespace

However, all my attempts to save the trained models to tempdir() return an empty directory.

Here's a simple example:

library(MASS)
lm1 <- lm(medv ~ ., data = Boston)
tempfile('lm1')

The model object lm1 is fine; that's what I want to share with the user. The last line returns a path on my computer, but that directory is empty:

> tempfile('lm1')
[1] "/var/folders/2c/hvxqjjcn3sq5lcshxhv8wnsw0000gp/T//Rtmp4vWDDq/lm1269d23afd784"

The problem is identical if the result of tempfile() is assigned to a name:

library(MASS)
lm1 <- lm(medv ~ ., data = Boston)
model1 <- tempfile('lm1')
> model1
[1] "/var/folders/2c/hvxqjjcn3sq5lcshxhv8wnsw0000gp/T//Rtmp4vWDDq/lm1269d6d6e2d69"

Same issue if I create a specific temp directory:

temp1 <- tempdir()
library(MASS)
lm1 <- lm(medv ~ ., data = Boston)
model1 <- tempfile('lm1')

In an R package, what's the proper way to write to tempdir() so that the user can apply the trained models to future (unseen) data?



Solution

  • The purpose of tempfile(.) is to

    Create Names for Temporary Files
    

    It tries to guarantee their uniqueness,

         The names are very likely to be unique among calls to ‘tempfile’
         in an R session and across simultaneous R sessions (unless
         ‘tmpdir’ is specified).  The filenames are guaranteed not to be
         currently in use.
    

    That said, because tempfile() does not create the file, there is a theoretical race condition: another near-simultaneous call (using the same tmpdir=) could return the same name before either caller has actually created the file. In practice this is highly unlikely.

    tempfile() does not create a file, and it does not save an object to one. It only returns a filename; it is up to the caller to decide what to do with that ephemeral name.

    I suggest something like:

    library(MASS)
    lm1 <- lm(medv ~ ., data = Boston)
    tf <- tempfile('lm1_', fileext='.rds')
    saveRDS(lm1, tf)
    

    This is one way to save it. If you prefer .rda files, you could use save(lm1, file=tf) instead (with fileext='.rda' in the tempfile() call so the extension matches). If you want a summary of the model rather than the model itself, you could derive a data frame (further code, not my point) and save it as a CSV or xlsx file. Again, it's completely up to the caller to decide what to save, how to save it, and to actually do the saving.
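
To tie this back to the CRAN comment about omitting default paths, here is a minimal sketch of the full round trip. The helper save_model() is hypothetical (it is not from the question or from any package); it simply takes the path as a required argument, so that an example, vignette, or test can pass a tempfile() name while a real user passes a location of their own choosing.

```r
library(MASS)  # provides the Boston data set used in the question

# Hypothetical helper: `path` has no default, so the caller decides where to write.
save_model <- function(model, path) {
  saveRDS(model, path)
  invisible(path)
}

lm1 <- lm(medv ~ ., data = Boston)

# tempfile() only returns a name; nothing exists on disk yet.
tf <- tempfile("lm1_", fileext = ".rds")
exists_before <- file.exists(tf)

# Writing the file is a separate, explicit step.
save_model(lm1, tf)
exists_after <- file.exists(tf)

# Later in the same session: reload the model and predict on other rows.
lm1_reloaded <- readRDS(tf)
new_data <- Boston[1:5, names(Boston) != "medv"]
preds <- predict(lm1_reloaded, newdata = new_data)

# Clean up the temporary file when done.
unlink(tf)
```

Keep in mind that the session's temporary directory is normally deleted when the R session ends, so tempdir() suits examples, vignettes, and tests. A user who wants the model available in a later session would pass a path of their own to the same function.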