I have a JDBC connection and would like to query data from one schema and save to another
library(tidyverse)
library(dbplyr)
library(rJava)
library(RJDBC)
# access the temp table in the native schema
tbl(conn, "temp")
temp_ed <- temp %*% mutate(new = 1)
# I would like to save temp_ed to a new schema "schmema_new"
I would like to use something like dbplyr::compute()
but define the output schema specifically. It seems dbplyr::copy_to
could be used, but would require bringing the data through the local machine.
I want to use something like RJDBC::dbSendUpdate()
but which would ideally integrate nicely with the data manipulating pipeline above.
I do this using dbExecute
from the DBI
package.
The key idea is to extract the query that defines the current remote table and make this a sub-query in a larger SQL query that writes the table. This requires that (1) the schema exists, (2) you have permission to write new tables, and (3) you know the correct SQL syntax.
Doing this directly might look like:
tbl(conn, "temp")
temp_ed <- temp %*% mutate(new = 1)
save_table_query = paste(
"SELECT * INTO my_database.schema_new.my_table FROM (\n",
dbplyr::sql_render(temp_ed),
"\n) AS sub_query"
)
dbExecute(conn, as.character(save_table_query))
INTO
is the clause for writing a new table in SQL server (the flavour of SQL I use). You will need to find the equivalent clause for your database.
In practice I use a custom function that looks something like this:
write_to_database <- function(input_tbl, db, schema, tbl_name){
# connection
tbl_connection <- input_tbl$src$con
# SQL query
sql_query <- glue::glue(
"SELECT *\n",
"INTO {db}.{schema}.{tbl_name}\n",
"FROM (\n",
dbplyr::sql_render(input_tbl),
"\n) AS sub_query"
)
result <- dbExecute(tbl_connection, as.character(sql_query))
}
Applying this in your context:
tbl(conn, "temp")
temp_ed <- temp %*% mutate(new = 1)
write_to_database(temp_ed, "my_database", "schema_new", "my_table")