r, production-environment, rdata

Programmatically extract an object from a collection of RData files


We work in a production environment with large datasets assembled from API calls. These are saved as RData files to retain the full environment and the subsequent data summaries. The RData files are very large, and each contains multiple dataframe objects, generated by a standard analysis workflow, with similar names and structures across files.

I'm looking for a clean way to walk through the collection of RData files, pull a named object from each, and assemble the results into an AllCohorts dataframe for analysis.


Solution

  • The following modification (1) allows a request for several objects and (2) avoids the need for an assignment, i.e., it places the objects directly in the calling environment. It returns the names of any requested objects that were not found in the file.

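    extractRData <- function(file, objects) {
      caller <- parent.frame()          # environment extractRData() was called from
      objectsNotFound <- character(0)
      E <- new.env()
      load(file = file, envir = E)      # load into a scratch environment
      for (object in objects) {
        if (exists(object, envir = E, inherits = FALSE)) {
          # copy the object into the calling environment under its own name
          assign(object, get(object, envir = E, inherits = FALSE), envir = caller)
        } else {
          objectsNotFound <- c(objectsNotFound, object)
        }
      }
      # names of requested objects that were not in the file
      return(objectsNotFound)
    }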
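
To go from a single file to the AllCohorts dataframe asked for in the question, the same load-into-a-scratch-environment idea can be applied across a directory of files. The following is only a minimal sketch: the helper name `collectFromRData`, the object name `cohort`, the added `sourceFile` column and the temporary demo files are all assumptions made for illustration, and it assumes the named object has the same columns in every file so that `rbind` can stack them.

```r
# Hypothetical helper: pull one named data frame out of each .RData file
# and stack the results into a single data frame.
collectFromRData <- function(files, objectName) {
  pieces <- lapply(files, function(f) {
    E <- new.env()
    load(file = f, envir = E)              # load into a scratch environment
    if (!exists(objectName, envir = E, inherits = FALSE)) {
      warning("'", objectName, "' not found in ", f)
      return(NULL)                         # NULL entries are ignored by rbind
    }
    df <- get(objectName, envir = E, inherits = FALSE)
    df$sourceFile <- rep(basename(f), nrow(df))  # record which file each row came from
    df
  })
  do.call(rbind, pieces)
}

# Demo with two throwaway files; the object name 'cohort' is an assumption.
demoDir <- tempfile("rdata_demo")
dir.create(demoDir)
cohort <- data.frame(id = 1:2, value = c(10, 20))
save(cohort, file = file.path(demoDir, "a.RData"))
cohort <- data.frame(id = 3:4, value = c(30, 40))
save(cohort, file = file.path(demoDir, "b.RData"))
rm(cohort)

files <- list.files(demoDir, pattern = "\\.RData$", full.names = TRUE)
AllCohorts <- collectFromRData(files, "cohort")
```

Note that `load()` still reads the whole file each time; using a fresh environment per file only ensures that the unwanted objects are dropped once the wanted one has been copied out.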