r drake-r-package

Progress bar within drake functions


I'm trying to implement a progress bar within a function for use in a drake project. I am using the progress_bar R6 class from the progress package. The following example produces the expected progress bar:

    library(dplyr)
    library(purrr)
    library(progress)

    data <- mtcars %>%
      split(.$carb)

    n <- length(data)

    pb <- progress::progress_bar$new(total = n)

    data <- data %>%
      map(~{
        pb$tick()
        Sys.sleep(2)
        lm(mpg ~ wt, data = .x)
      })

If I put this into my drake workflow, a new progress bar displays for each iteration:

    library(drake)  # provides drake_plan() and make()

    fit_lm <- function() {
      data <- mtcars %>%
        split(.$carb)

      n <- length(data)

      pb <- progress::progress_bar$new(total = n)

      data <- data %>%
        map(~{
          pb$tick()
          Sys.sleep(2)
          lm(mpg ~ wt, data = .x)
        })

      return(data)
    }

    plan <- drake_plan(
      models = fit_lm()
    )

    make(plan)

Console output: [screenshot not available]

How can I modify the function to display only one progress bar that updates on each iteration?


Solution

  • drake captures messages for reproducibility, so there is friction with the progress package. But as Adam Kowalczewski pointed out at https://community.rstudio.com/t/including-a-progress-bar-in-a-drake-plan-step/42516, dplyr has a progress bar of its own, progress_estimated(), and you can make it print to stdout with pb$tick()$print(). This worked for me:

    library(drake)
    library(dplyr)
    library(purrr)
    
    fit_lm <- function() {
      data <- mtcars %>%
        split(.$carb)
      n <- length(data)
      pb <- progress_estimated(n = n)
      data <- data %>%
        map(~{
          pb$tick()$print()
          Sys.sleep(2)
          lm(mpg ~ wt, data = .x)
        })
      return(data)
    }
    
    plan <- drake_plan(
      models = fit_lm()
    )
    
    make(plan)
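
    The same idea (print the bar to stdout rather than through messages) can probably also be sketched with base R's utils::txtProgressBar(), which writes to stdout by default. This may be useful because progress_estimated() has been deprecated in later dplyr releases. I have not tested this variant inside make(); fit_lm_base() and its pause argument are hypothetical names introduced only for this sketch:

```r
# Hypothetical base-R variant of fit_lm(); not part of the original answer.
# Assumption: a bar written to stdout is displayed the same way under drake
# as the dplyr bar above. That has not been verified here.
fit_lm_base <- function(pause = 0) {
  # One data frame per value of carb, as in the original function.
  groups <- split(mtcars, mtcars$carb)

  # txtProgressBar() writes to stdout by default (file = "").
  pb <- utils::txtProgressBar(min = 0, max = length(groups), style = 3)
  on.exit(close(pb), add = TRUE)

  models <- lapply(seq_along(groups), function(i) {
    Sys.sleep(pause)                  # stand-in for slow work
    fit <- lm(mpg ~ wt, data = groups[[i]])
    utils::setTxtProgressBar(pb, i)   # redraw the single bar in place
    fit
  })
  names(models) <- names(groups)
  models
}

models <- fit_lm_base()
```

    In a plan, this would replace the target command, e.g. models = fit_lm_base(pause = 2).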