I have a simple data pipeline where the data and the functional code are stored in a custom R package hosted on Azure DevOps, and the final outputs are produced by Quarto documents pointing to specific branches of the package.
I am struggling to find a proper way to connect the Quarto documents to this package so that it is downloaded only if it has changed. Here's the code in the first chunk of the Quarto doc:
```{r setup}
#| include: false
#| cache: false
#| message: false
#| warning: false
c("git2r", "renv", "remotes") |>
  sapply(\(x) {
    if (!requireNamespace(x, quietly = TRUE)) {
      install.packages(x)
    }
  })
renv::activate()
gitcred <- git2r::cred_user_pass(
  username = "username",
  password = "repo-specific-pass"
)
remotes::install_git(
  "https://my-azure-repo.com/_git",
  build_manual = FALSE, build_vignettes = FALSE,
  upgrade = FALSE,
  dependencies = TRUE,
  ref = "2024-04-29_ECCMID", credentials = gitcred
)
renv::snapshot()
```
This code works, but it reinstalls the package every time. What should I change to avoid this?
I would suggest using `installed.packages()` to check the version number of the installed package, compare that to the version number in the `DESCRIPTION` of the package in Git, and then install if needed.
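Untested, but something like this:

    inst <- installed.packages()
    # NA if the package is not installed yet, so the first run also installs it
    inst_version <- if ("your_package" %in% rownames(inst)) inst["your_package", "Version"] else NA
    # read.dcf() treats a string as a local file path, so wrap the URL in url();
    # a private repo will also need authentication here
    git_version <- read.dcf(url("https://my-azure-repo.com/your_package/DESCRIPTION"))[, "Version"]
    if (is.na(inst_version) || inst_version != git_version) {
      remotes::install_git(...) # as in your question
    }
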
(With credit to this answer for reading the DESCRIPTION file nicely.)
Obviously this will only update if the version number is changed, so you should account for that in your workflow.
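
If you want the check in a reusable form, here is a minimal sketch, also untested against a real Azure DevOps repo. `needs_install()` and its arguments are hypothetical names I made up for illustration, and it assumes you have already fetched the text of the remote `DESCRIPTION` file somehow (for a private repo that step will need authentication). It treats a missing package as needing installation and compares versions as version objects rather than as strings.

```r
# Hypothetical helper: decide whether a package must be (re)installed.
# `pkg` is the package name; `description_lines` is the remote DESCRIPTION
# file as a character vector of lines (how you fetch it is up to you).
needs_install <- function(pkg, description_lines) {
  # Parse the DESCRIPTION text in memory, without writing a temp file
  con <- textConnection(description_lines)
  on.exit(close(con))
  remote_version <- package_version(
    read.dcf(con, fields = "Version")[1, "Version"]
  )

  # system.file() returns "" when the package is not installed;
  # unlike requireNamespace(), it does not load the package
  if (!nzchar(system.file(package = pkg))) {
    return(TRUE)
  }

  # Compare as version objects so that, e.g., 1.10.0 and 1.9.0 are ordered
  # correctly and "1.0" equals "1.0.0"-style differences are handled by R
  utils::packageVersion(pkg) != remote_version
}
```

You would then call `remotes::install_git(...)` from the question only when `needs_install()` returns `TRUE`. The same caveat applies: nothing is reinstalled unless the `Version` field in `DESCRIPTION` is bumped.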