r, drake-r-package

Using code_to_plan and target(..., format = "fst") in drake


I really like using the code_to_plan function when constructing drake plans. I also really like using target(..., format = "fst") for big files. However, I am struggling to combine these two workflows. For example, if I have this _drake.R file:

# Data --------------------------------------------------------------------

data_plan = code_to_plan("code/01-data/data.R")
join_plan = code_to_plan("code/01-data/merging.R")


# Cleaning ----------------------------------------------------------------

cleaning_plan = code_to_plan("code/02-cleaning/remove_na.R")


# Model -------------------------------------------------------------------

model_plan = code_to_plan("code/03-model/model.R")


# Combine Plans
dplan = bind_plans(
  data_plan,
  join_plan,
  cleaning_plan,
  model_plan
  )

config <- drake_config(dplan)

This works fine when called with r_make(r_args = list(show = TRUE)).

As I understand it, though, target() can only be used within drake_plan(). If I try something like this:

dplan2 <- drake_plan(full_plan = target(dplan, format = "fst"))
config <- drake_config(dplan2)

I get an r_make error like this:

target full_plan
Error in fst::write_fst(x = value$value, path = tmp) :
  Unknown type found in column.
In addition: Warning message:
You selected fst format for target full_plan, so drake will convert it from class c("drake_plan", "tbl_df", "tbl", "data.frame") to a plain data frame.

Error: --> in process 18712

See .Last.error.trace for a stack trace.

So ultimately my question is: where does one specify special data formats for targets when using code_to_plan?

Edit

Using @landau's helpful suggestion, I defined this function:

add_target_format <- function(plan) {

  # Get a list of named commands.
  commands <- plan$command
  names(commands) <- plan$target

  # Turn it into a good plan.
  do.call(drake_plan, commands)

}

With the large targets in the scripts wrapped in target(..., format = "fst"), this now works:

dplan = bind_plans(
  data_plan,
  join_plan,
  cleaning_plan,
  model_plan
  ) %>%
  add_target_format()

Solution

  • It is possible, but not convenient. Here is a workaround.

    writeLines(
      c(
        "x <- small_data()",
        "y <- target(large_data(), format = \"fst\")"
      ),
      "script.R"
    )
    
    cat(readLines("script.R"), sep = "\n")
    #> x <- small_data()
    #> y <- target(large_data(), format = "fst")
    
    library(drake)
    
    # Produces a plan, but does not process target().
    bad_plan <- code_to_plan("script.R")
    bad_plan
    #> # A tibble: 2 x 2
    #>   target command                             
    #>   <chr>  <expr>                              
    #> 1 x      small_data()                        
    #> 2 y      target(large_data(), format = "fst")
    
    # Get a list of named commands.
    commands <- bad_plan$command
    names(commands) <- bad_plan$target
    
    # Turn it into a good plan.
    good_plan <- do.call(drake_plan, commands)
    good_plan
    #> # A tibble: 2 x 3
    #>   target command      format
    #>   <chr>  <expr>       <chr> 
    #> 1 x      small_data() <NA>  
    #> 2 y      large_data() fst
    

    Created on 2019-12-18 by the reprex package (v0.3.0)
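
If you would rather not write target() inside the scripts, another option may be to set the format column on the plan yourself after code_to_plan() has run. The sketch below assumes drake treats a hand-added format column the same way as one produced by target(), which is worth checking against your drake version. The helper set_target_format() and the toy data frame are hypothetical names for illustration, not part of drake. The toy data frame stands in for a plan so that the sketch runs without drake installed.

```r
# Hypothetical helper (not part of drake): mark selected targets with a
# storage format by filling in a `format` column on the plan.
set_target_format <- function(plan, targets, format = "fst") {
  stopifnot(is.data.frame(plan), "target" %in% names(plan))

  # Fail early on misspelled target names instead of silently ignoring them.
  unknown <- setdiff(targets, plan$target)
  if (length(unknown) > 0) {
    stop("Targets not found in plan: ", paste(unknown, collapse = ", "))
  }

  # Create the column if the plan does not have one yet.
  if (!"format" %in% names(plan)) {
    plan$format <- NA_character_
  }

  plan$format[plan$target %in% targets] <- format
  plan
}

# Stand-in for a plan (assumption: a real plan from code_to_plan() has at
# least these two columns, with `command` holding expressions, not strings).
toy_plan <- data.frame(
  target = c("x", "y"),
  command = c("small_data()", "large_data()"),
  stringsAsFactors = FALSE
)

toy_result <- set_target_format(toy_plan, "y")
toy_result$format
```

In the _drake.R file from the question, the same helper would be applied to the result of bind_plans(...) before calling drake_config(), passing the names of the large data-frame targets. The early check on unknown names is a design choice: a typo in a target name would otherwise leave that target in the default format without any warning.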