I'm learning more about duckdb and exploring the various ways to "save" a database. I understand I should be able to save a duckdb database however when I try to save it, I get an error.
Error: unable to find an inherited method for function ‘dbWriteTable’ for signature ‘conn = "duckdb_connection", name = "character", value = "tbl_duckdb_connection"’
Ultimately, I am experimenting with the best ways to save and load data to duckdb by trying to save a database itself (not just the csv or parquet files).
If you have any thoughts on the benefits or challenges of this approach vs. csv /parquet please let me know.
After saving the database, I'd also like to load it as well (which I give the code for but sure if it works because I can't save it)
Lastly, I created the duckdb connection via a temp file which I understand will create a persistence file and allow me to work out of memory vs. the regular duckdb() which limits me to in memory (https://josiahparry.com/posts/2024-05-24-duckdb-and-r)
Any suggestions?
library(tidyverse)
library(duckdb)
# this works -- but I have limited understanding of the implications between duckdb() only vs. the below solution
drv <- duckdb(tempfile(fileext = ".duckdb"))
# this works
con <- dbConnect(drv)
# this works
duckdb_register(con,"diamonds",diamonds,overwrite = TRUE)
# this works
diamonds_db <- dplyr::tbl(con, sql("select * from diamonds"))
# this doesn't work -- I'm trying to save this as duckdb database object (as oppose to a csv file or parquet file) -- I am assuming this is better saving format (eg. lighter?)
duckdb::dbWriteTable(
con
,"data/diamonds.duckdb"
,diamonds_db
,append = TRUE)
)
Ideally if I can figure out how to save it, I can also read from it with what I assume will work as below
library(tidyverse)
library(duckdb)
con <- dbConnect(
duckdb::duckdb(),
dbdir = "data/diamonds.duckdb"
)
# I don't know if this works
df <- dbReadTable(con, "diamonds")
thanks to @G. Grothendieck, the problem was my approach is two fold:
Below is the corrected code
library(tidyverse)
library(duckdb)
drv <- duckdb(dbdir ="database.duckdb")
con <- dbConnect(drv)
duckdb::dbWriteTable(
con
,"diamonds.dbi"
,diamonds
,append = TRUE)
)
This will successfully save your table your database.
If you quite and start a new session, simply reconnect to your database and then use dplyr::tbl() to access it (no need to read csv files).
drv <- duckdb(dbdir ="database.duckdb")
con <- dbConnect(drv)
## this loads your table from your persistance database
tbl(con,sql("select * from 'diamonds.dbi'"))
Ensure you wrap reference to your table with ' or it won't work.