I'm creating an R package with some included datasets that I want both to export for users and to use internally in the package's functions.
For example, let's say I create a dataset called measurements like this:
measurements <- data.frame(id = c(1:10), value = runif(10))
usethis::use_data(measurements, overwrite = TRUE)
That makes the measurements data frame accessible to the user just by referencing measurements.
Now, I also want to write a package function that uses the same data frame internally:
#' fn_docalc
#'
#' @param x Value to multiply by
#'
#' @return Measurements dataframe multiplied by x
#' @export
fn_docalc <- function(x){
  measurements$value <- measurements$value * x
  measurements
}
This works fine, but it fails if the user loads the package and also happens to create their own variable called measurements in the global environment. If that occurs, then fn_docalc operates on that new global variable instead of on the package's variable. How can I write the function/package so that fn_docalc always references the package's own measurements variable, even if a different global version of measurements exists?
You used usethis::use_data(measurements, overwrite = TRUE); this put your dataset in the data subdirectory. Data stored there has somewhat weird semantics.
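If you have LazyData: true in your DESCRIPTION file, then the data object is put into the exports from the package, but it is not in the internal environment that functions use. An unqualified measurements inside your function is then found only by searching outward, and the global environment is searched before the attached package, which is why a user's variable wins. In that case your functions will need the myPkgname:: prefix.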
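As a rough illustration of the difference the prefix makes, here is a sketch that uses the built-in datasets package as a stand-in for myPkgname, and mtcars as a stand-in for measurements. The function names are made up for the example, and the unqualified function lives in the global environment rather than in a package namespace, so it only approximates the situation in the question.

```r
# A user variable with the same name as a dataset shipped by an attached
# package (stand-in for a user-created `measurements`).
mtcars <- data.frame(mpg = c(1, 2, 3))

# Unqualified lookup: the global environment is searched before the
# attached package, so the user's 3-row mtcars is found.
scale_mpg_unqualified <- function(x) {
  mtcars$mpg * x
}

# Qualified lookup: the pkg:: prefix always fetches the packaged dataset,
# regardless of what the user has defined.
scale_mpg_qualified <- function(x) {
  datasets::mtcars$mpg * x
}

length(scale_mpg_unqualified(2))  # based on the 3-row user variable
length(scale_mpg_qualified(2))    # based on the 32-row packaged dataset
```

In fn_docalc the equivalent change would be to start from myPkgname::measurements instead of the bare name.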
If you don't have that LazyData: line, or set it to false, then the data is not visible at all until you call the data() function, which by default loads it into the calling environment (the global environment when called from the console).
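For completeness, a minimal sketch of what calling data() from inside a function could look like, again using datasets and mtcars as stand-ins for your package and dataset. The function name and the local_env variable are made up for the example; the envir argument is what keeps the loaded copy out of the user's workspace.

```r
scale_mpg_loaded <- function(x) {
  # Load the dataset into a private environment instead of the caller's.
  local_env <- new.env()
  utils::data("mtcars", package = "datasets", envir = local_env)

  cars <- local_env$mtcars
  cars$mpg <- cars$mpg * x
  cars
}

nrow(scale_mpg_loaded(2))  # 32 rows, taken from the packaged dataset
```

This works, but it reloads the data on every call, which is part of why this route is awkward for data your own functions need.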
For your use case, where you want the data available both to users and to your own functions, neither of these makes sense. You want the dataset visible in both environments.
To get it into the internal environment, you create it in one of your .R files in the R directory. For your sample data, just put
measurements <- data.frame(id = c(1:10), value = runif(10))
in one of those files. For a larger dataset you might want to store it in a compressed format somewhere (e.g. in inst/extdata), and have your .R file read it in at package install time.
To also get it into the exports, you specify it in your NAMESPACE file, or let Roxygen do that for you, by using @export in the .R file.
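Putting the two pieces together, a file in the R directory might look roughly like the sketch below. The file name, title and @format text are placeholders I made up; the data frame and fn_docalc are the ones from the question. With Roxygen, the @export tag on the data object should produce an export(measurements) line in NAMESPACE, which you could also write by hand.

```r
# R/measurements.R (hypothetical file name)

#' Example measurements
#'
#' A small data frame created when the package is installed.
#'
#' @format A data frame with 10 rows and 2 columns, id and value.
#' @export
measurements <- data.frame(id = 1:10, value = runif(10))

#' fn_docalc
#'
#' @param x Value to multiply by
#'
#' @return Measurements dataframe multiplied by x
#' @export
fn_docalc <- function(x) {
  # Inside the package, this name resolves to the object defined above,
  # because the package's own environment is searched first.
  measurements$value <- measurements$value * x
  measurements
}
```

Note that runif(10) is evaluated when the package is installed, so the values are fixed per installation rather than regenerated on each load. Also remove data/measurements.rda if you go this way, so the package does not ship two objects with the same name.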