I create a set of files in my drake plan. I want to copy a subset of these files to another location.
The following code almost achieves that. However, drake's dependency tracking of file changes is lost after taking the subset of file targets that I want to copy.
How can I combine/subset dynamic file targets without losing drake's dependency tracking?
# Copy a file next to itself with a "_copy" suffix and return the new path.
copy_file <- function(file) {
  file_copy <- paste0(file, "_copy")
  file.copy(from = file, to = file_copy, overwrite = TRUE)
  file_copy
}
herb_1_a <- "parsley"
plan <- drake::drake_plan(
  file_1 = target(
    {
      writeLines(herb_1_a, "file_1_a") # herb_1_a changes before the second make()
      writeLines("sage", "file_1_b")
      c("file_1_a", "file_1_b")
    },
    format = "file"
  ),
  file_2 = target(
    {
      writeLines("rosemary", "file_2_a")
      writeLines("thyme", "file_2_b")
      c("file_2_a", "file_2_b")
    },
    format = "file"
  ),
  # Keep only the paths ending in "_a"; this is a plain character vector.
  files_to_copy = stringr::str_subset(
    c(file_1, file_2),
    "_a$"
  ),
  file_copies = target(
    copy_file(files_to_copy),
    dynamic = map(files_to_copy),
    format = "file"
  )
)
drake::make(plan)
#> ▶ target file_2
#> ▶ target file_1
#> ▶ target files_to_copy
#> ▶ dynamic file_copies
#> > subtarget file_copies_5e57e9ee
#> > subtarget file_copies_ae26ecf9
#> ■ finalize file_copies
readLines("file_1_a")
#> [1] "parsley"
readLines("file_1_a_copy")
#> [1] "parsley"
herb_1_a <- 'banana'
drake::make(plan)
#> ▶ target file_1
#> ▶ target files_to_copy
readLines("file_1_a")
#> [1] "banana"
readLines("file_1_a_copy") # I want this to be "banana"
#> [1] "parsley"
Created on 2020-09-24 by the reprex package (v0.3.0)
I think the way to solve this is to create a dynamically mapped set of dynamic file targets right before the copying step. In other words, files_to_copy should be a dynamic target of dynamic files. Sketch:
plan <- drake::drake_plan(
  file_1 = target(
    {
      writeLines(herb_1_a, "file_1_a") # herb_1_a changes before the second make()
      writeLines("sage", "file_1_b")
      c("file_1_a", "file_1_b")
    },
    format = "file"
  ),
  file_2 = target(
    {
      writeLines("rosemary", "file_2_a")
      writeLines("thyme", "file_2_b")
      c("file_2_a", "file_2_b")
    },
    format = "file"
  ),
  # Plain character vector of the paths ending in "_a".
  files_to_copy_group = stringr::str_subset(
    c(file_1, file_2),
    "_a$"
  ),
  # Re-declare each selected path as its own dynamic file sub-target.
  files_to_copy = target(
    files_to_copy_group,
    dynamic = map(files_to_copy_group),
    format = "file"
  ),
  file_copies = target(
    copy_file(files_to_copy),
    dynamic = map(files_to_copy),
    format = "file"
  )
)
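As a rough illustration of why the intermediate target may matter: my assumption (not something confirmed above) is that a plain target such as files_to_copy_group only stores path strings, while a format = "file" target also looks at the file contents. The base R sketch below shows the difference between the two views. tools::md5sum merely stands in for whatever hashing drake does internally, and the temp path is a hypothetical throwaway file, not one from the plan.

```r
# Base R only. The path below is a throwaway temp file, not part of the plan.
path <- file.path(tempdir(), "file_1_a")

# First "run": write the original contents.
writeLines("parsley", path)
paths_before <- path                         # what a plain character target holds
hash_before <- unname(tools::md5sum(path))   # what a content-aware check compares

# Second "run": same path, new contents.
writeLines("banana", path)
paths_after <- path
hash_after <- unname(tools::md5sum(path))

# The path string is the same in both runs, so anything that compares only
# the string sees no change.
paths_unchanged <- identical(paths_before, paths_after)

# The content hash differs, so a check based on file contents sees the edit.
content_changed <- !identical(hash_before, hash_after)
```

If that assumption holds, mapping over files_to_copy_group alone gives each sub-target a value that never changes, whereas routing the paths through a format = "file" target lets the content change propagate to file_copies.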