I have ~50 data files (subjects) that I process individually before I combine them in a data.frame for modelling. I'm unsure how to best use {targets} for this.
I tried using dynamic branching, but I'm unsure how to keep track of subject IDs with this approach. In my current approach I keep all data in a named list where the first-level names are subject IDs, but with targets the branch names are arbitrary.
I know this is not really a specific question, but I'm hoping to be pointed towards an appropriate solution instead of getting a "correct" answer to the wrong question.
This is the pattern that I normally use:
tar_files(
  file_paths,
  "file_paths_folder" %>%
    list.files(full.names = TRUE)
),
tar_target(
  processed_files,
  file_paths %>%
    readxl::read_excel() %>% # can be anything: read csv, parquet etc.
    janitor::clean_names() %>% # start processing
    mutate_at(vars(a, b, c), as.Date, format = "%Y-%m-%d"), # can be really complex operations
  pattern = map(file_paths)
)
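
One way the subject ID could be kept with this pattern, as a minimal sketch rather than a definitive answer: store the ID as a column inside each branch instead of relying on branch names. The sketch assumes that the file name without its extension is the subject ID; `add_subject_id()` and the `subject_counts` target are hypothetical names introduced only for illustration.

```r
# _targets.R (sketch)
library(targets)
library(tarchetypes)

# Hypothetical helper: assumes the file name (without extension) is the subject ID.
add_subject_id <- function(data, path) {
  id <- tools::file_path_sans_ext(basename(path))
  data$subject_id <- rep(id, nrow(data))
  data
}

list(
  # One branch per file; each branch of `file_paths` is a single path.
  tar_files(
    file_paths,
    list.files("file_paths_folder", full.names = TRUE)
  ),
  tar_target(
    processed_files,
    {
      raw <- readxl::read_excel(file_paths)
      # Tag every row with the subject it came from.
      add_subject_id(janitor::clean_names(raw), file_paths)
    },
    pattern = map(file_paths)
  ),
  # Hypothetical downstream target: uses all branches combined.
  tar_target(
    subject_counts,
    dplyr::count(processed_files, subject_id)
  )
)
```

With the default `iteration = "vector"`, a downstream target that is not itself branched should see the data frame branches combined into one data frame (assuming the column types are compatible across files), so `subject_id` can then be used for grouping or as a modelling variable.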