rtargetrsqlite

RSQLite & targets in R


I want to create a database for efficient queries, but using with targets. Is there a better alternative to open and close connection?

  tar_target(database, format = "file", command = {
    db_file_name <- Sys.getenv("DB_PATH", "database.sqlite")
    db <- dbConnect(SQLite(), db_file_name)
    
    dta_to_db(db, crsp_daily, "crsp_daily")
    dta_to_db(db, crsp_monthly, "crsp_monthly")
    dta_to_db(db, analist_coverage, "analist_coverage")
    dbwriteTable(db, industry_classification, "industry_classification")
    
    dbDisconnect(db)
    db_file_name
  })

Solution

  • I've started using sqltargets and duckdb and the pattern I now use is:

    In an R script sourced in _targets.R:

    load_tables_to_db <- function(...) {
      tables <- lst(...)
      path_to_db <- fs::path("<path_to_db>")
    
      con <- DBI::dbConnect(
        duckdb::duckdb(),
        dbdir = path_to_db,
        read_only = FALSE
      )
      on.exit(DBI::dbDisconnect(con, shutdown = TRUE))
      walk2(tables, names(tables), \(df, nm) DBI::dbWriteTable(conn = con, name = nm, value = df, overwrite = TRUE))
    }
    

    And in _targets.R:

    ...
      tar_target(table1, mtcars),
      tar_target(table2, iris),
      tar_target(tables_loaded_to_db, load_tables_to_db(table1, table2)),
      tar_sql(report1, "query1.sql")
    ...