I've noticed hints/suggestions/warnings in the drake docs suggesting use of expose_imports to ensure that changes in imported packages are tracked reproducibly, but the docs are relatively brief on the correct usage of this.
I've now witnessed an example of the behaviour expose_imports is designed to correct in my own usage of drake, and I'd like to start using it.
In my case, the dependency that wasn't tracked was forcats, which, in version 0.4.0, had a bug in fct_collapse (used by one of my functions) that would assign incorrect groups to the output factor. Version 0.4.0.9000 resolved this bug, and I updated to 0.4.0.9000 some time ago, but I noticed that targets that must have run against the old version were not invalidated.
I'm guessing that this is a problem that expose_imports might mitigate, but I don't really understand how / where to use it.
If I make scoped calls to my.package in my drake plans like so:
plan <- drake::drake_plan(
  mtc = mtcars,
  mtc_xformed = my.package::transform_mtc(mtc)
)
And my.package::transform_mtc() has some dependency on another package (e.g. forcats), then:
Where should I call expose_imports? In the prework argument of make? Somewhere in my.package/R/?
Should I call expose_imports("my.package") or expose_imports("forcats")?
Some clarification of this would be awesome.
expose_imports() is mostly for packages you update/reinstall a lot. For example, say you write a package to implement a new statistical method, and the package is still under active development. Meanwhile, you are also writing a journal article about the method, and you have a reproducible drake pipeline to run simulation studies and compile the manuscript. Here, it is important to refresh the paper when you make changes to the package. In the project archetype here, your R/packages.R file would look something like this:
library(drake)
library(tidyverse)
library(yourCustomPackage)
expose_imports(yourCustomPackage)
Then, the plan can use functions from yourCustomPackage.
plan <- drake_plan(
  analysis = custom_method(...) # from yourCustomPackage
  # ...
)
Now, drake will invalidate targets in response to changes in custom_method(), along with any nested dependency functions of custom_method() in yourCustomPackage, and the dependencies of those dependencies in yourCustomPackage, etc. (Check vis_drake_graph() to see for yourself.)
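To see this invalidation behavior without building a package first, here is a minimal sketch. It assumes two hypothetical functions, helper() and custom_method(), defined in the global environment as stand-ins for functions that would live in yourCustomPackage; the point of expose_imports() is to make drake analyze package functions in the same way. It also assumes a reasonably recent drake in which outdated() accepts the plan directly.

```r
library(drake)

# Hypothetical stand-ins for functions that would live in yourCustomPackage.
helper <- function(x) x + 1
custom_method <- function(x) helper(x) * 2

plan <- drake_plan(
  analysis = custom_method(1)
)

# What drake's static code analysis detects inside custom_method().
code_deps <- deps_code(custom_method)

# Build the target, then ask which targets are out of date.
make(plan)
before <- outdated(plan)

# Change only the nested dependency, not custom_method() itself.
helper <- function(x) x + 100
after <- outdated(plan)

print(code_deps)
print(before)
print(after)
```

Right after make(), nothing should be outdated; after helper() changes, the analysis target should be reported as outdated even though custom_method() was not edited. Note that make() writes a .drake/ cache into the working directory, so this is best tried in a scratch folder.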
expose_imports() is usually something I only recommend for packages directly related to the content of your research. It is not something I usually recommend for utilities like forcats. For those packages, I recommend renv to prevent unexpected changes from happening to begin with. In your case, I would update forcats, lock it down with renv, invalidate the targets you know depend on forcats, and trust that future changes to forcats are unlikely to be necessary.
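As a rough sketch of that workflow, meant to be run interactively from the project root one step at a time: it assumes the plan object from the question is already defined in the session, and it uses mtc_xformed as the target to invalidate only because that is the target in the question's plan. Substitute whichever targets you know depend on forcats.

```r
# One-time setup: give the project its own library and lockfile.
renv::init()

# After installing the fixed version of forcats, record the exact
# package versions currently in use in renv.lock.
renv::snapshot()

# Remove the targets known to depend on forcats from the drake cache
# so that the next make() rebuilds them against the new version.
drake::clean(mtc_xformed)

# Rebuild. `plan` is assumed to be the plan from the question.
drake::make(plan)
```

Later, renv::restore() reinstalls the versions recorded in the lockfile, which is what keeps forcats from changing underneath the pipeline.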
Scoped calls like my.package::transform_mtc(mtc) tell drake to track transform_mtc(), but not any unscoped dependency functions called from inside transform_mtc(). This is a one-foot-in-one-foot-out behavior that I no longer agree with. Next chance I get, I will make drake stop tracking these calls.