I want to create a database for efficient queries while using targets. Is there a better alternative to opening and closing the connection inside the target?
tar_target(database, format = "file", command = {
  db_file_name <- Sys.getenv("DB_PATH", "database.sqlite")
  db <- dbConnect(SQLite(), db_file_name)
  dta_to_db(db, crsp_daily, "crsp_daily")
  dta_to_db(db, crsp_monthly, "crsp_monthly")
  dta_to_db(db, analist_coverage, "analist_coverage")
  # dbWriteTable() takes the table name before the data frame
  dbWriteTable(db, "industry_classification", industry_classification)
  dbDisconnect(db)
  db_file_name
})
I've started using sqltargets and duckdb and the pattern I now use is:
In an R script sourced in _targets.R:
load_tables_to_db <- function(...) {
  tables <- lst(...)
  path_to_db <- fs::path("<path_to_db>")
  con <- DBI::dbConnect(
    duckdb::duckdb(),
    dbdir = path_to_db,
    read_only = FALSE
  )
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE))
  walk2(
    tables, names(tables),
    \(df, nm) DBI::dbWriteTable(conn = con, name = nm, value = df, overwrite = TRUE)
  )
}
And in _targets.R:
...
tar_target(table1, mtcars),
tar_target(table2, iris),
tar_target(tables_loaded_to_db, load_tables_to_db(table1, table2)),
tar_sql(report1, "query1.sql")
...
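If you also want targets to track the database file itself, as the `format = "file"` target in the question does, one option is to have the loader return the path it wrote to. This is only a rough sketch, not something I have run inside a pipeline: `write_tables_to_db` and its `tables` and `path_to_db` arguments are names I made up for illustration, and it assumes DBI and duckdb are installed.

```r
# Hypothetical helper (not part of targets, sqltargets, or duckdb):
# writes each named data frame in `tables` to a DuckDB file and returns
# the file path, so the target can be declared with format = "file".
write_tables_to_db <- function(tables, path_to_db) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = path_to_db, read_only = FALSE)
  # Close the connection even if one of the writes fails part-way through.
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE), add = TRUE)

  for (nm in names(tables)) {
    DBI::dbWriteTable(con, name = nm, value = tables[[nm]], overwrite = TRUE)
  }

  # Return the path rather than the data.
  path_to_db
}
```

In _targets.R this could then be called as `tar_target(database, write_tables_to_db(list(table1 = table1, table2 = table2), "database.duckdb"), format = "file")`, where the file name is just a placeholder. The connection is still opened and closed inside the function, but `on.exit()` takes care of the closing part.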