I need to update column values in a PostgreSQL table conditionally on the values of other columns. I managed to do it by writing an SQL statement in R and executing it with dbExecute from the DBI package.
library(dplyr)
library(DBI)
# Establish connection with database
con <- dbConnect(RPostgreSQL::PostgreSQL(), dbname = "myDb",
                 host = "localhost", port = 5432,
                 user = "me", password = myPwd)
# Write SQL update statement
request <- paste("UPDATE table_to_update",
                 "SET var_to_change = 'new value'",
                 "WHERE filter_var = 'filter'")
# Back-end execution
con %>% dbExecute(request)
Is it possible to do the same using only dplyr syntax? Out of curiosity, I tried:
con %>% tbl("table_to_update") %>%
mutate(var_to_change = if (filter_var == 'filter') 'new value' else var_to_change)
This works in R but obviously changes nothing in the database, since dplyr translates it into a SELECT statement. copy_to only offers append and overwrite options, so I can't see how to use it short of deleting and then re-appending the filtered observations...
The current dplyr 0.7.1 (with dbplyr 1.1.0) doesn't support this, because it assumes that all data sources are immutable. Issuing an UPDATE via dbExecute() seems to be the best bet.
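If the new value or the filter comes from R variables, binding them as parameters avoids pasting them into the SQL string. The following is only a sketch: it uses an in-memory SQLite database (via the RSQLite package, an assumption) so that it runs without a server, and the helper name update_where is made up for illustration. With an RPostgres connection the placeholders would be written $1 and $2 instead of ?.

```r
library(DBI)

# In-memory SQLite stands in for the PostgreSQL connection (assumption)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "table_to_update", data.frame(
  filter_var       = c("filter", "other"),
  var_to_change    = c("old", "old"),
  stringsAsFactors = FALSE
))

# Hypothetical helper: run the conditional UPDATE with bound values
update_where <- function(con, new_value, filter_value) {
  dbExecute(
    con,
    "UPDATE table_to_update SET var_to_change = ? WHERE filter_var = ?",
    params = list(new_value, filter_value)
  )
}

# dbExecute() returns the number of rows affected by the statement
n_changed <- update_where(con, "new value", "filter")
result <- dbReadTable(con, "table_to_update")
dbDisconnect(con)
```

Checking the returned row count is a cheap way to confirm that the WHERE clause matched what you expected.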
For replacing a larger chunk of a table, you could also:
1. Write the replacement rows to a temporary table with copy_to().
2. Issue a DELETE FROM ... WHERE id IN (SELECT id FROM <temporary table>).
3. Issue an INSERT INTO ... SELECT * FROM <temporary table>.
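The temporary-table approach could look roughly like the sketch below. Several things here are assumptions: the table has an id column, the staging table name staging_rows is invented, an in-memory SQLite database (RSQLite) stands in for PostgreSQL so the code runs as is, and dbWriteTable(..., temporary = TRUE) is used in place of copy_to() to keep the example free of dplyr. Wrapping the DELETE and INSERT in dbWithTransaction() is an addition that keeps other sessions from seeing the table with the rows missing.

```r
library(DBI)

# In-memory SQLite stands in for the PostgreSQL connection (assumption)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "table_to_update", data.frame(
  id               = 1:3,
  var_to_change    = c("a", "b", "c"),
  stringsAsFactors = FALSE
))

# Replacement rows computed in R (illustrative data)
new_rows <- data.frame(
  id               = c(2L, 3L),
  var_to_change    = c("B", "C"),
  stringsAsFactors = FALSE
)

# Step 1: stage the replacement rows in a temporary table
dbWriteTable(con, "staging_rows", new_rows, temporary = TRUE)

# Steps 2 and 3: delete the old rows and insert the new ones atomically;
# the transaction is rolled back if either statement fails
dbWithTransaction(con, {
  dbExecute(con, "DELETE FROM table_to_update
                  WHERE id IN (SELECT id FROM staging_rows)")
  dbExecute(con, "INSERT INTO table_to_update
                  SELECT * FROM staging_rows")
})

result <- dbGetQuery(con, "SELECT * FROM table_to_update ORDER BY id")
dbDisconnect(con)
```

Note that INSERT ... SELECT * relies on the staging table having the same columns in the same order as the target table.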
Depending on your schema, you might be able to do a single INSERT INTO ... ON CONFLICT DO UPDATE instead of DELETE and then INSERT.
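A sketch of the single-statement variant follows, assuming the target table has a primary key (or unique constraint) on id, which ON CONFLICT needs as its conflict target. As before, the in-memory SQLite database and the staging_rows name are assumptions made so that the example runs on its own; the statement uses syntax that PostgreSQL 9.5 and later also accept. The WHERE true is there only because SQLite needs it to parse INSERT ... SELECT ... ON CONFLICT unambiguously; it is harmless in PostgreSQL.

```r
library(DBI)

# In-memory SQLite stands in for the PostgreSQL connection (assumption)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# ON CONFLICT requires a primary key or unique constraint on the target
dbExecute(con, "CREATE TABLE table_to_update (
                  id INTEGER PRIMARY KEY,
                  var_to_change TEXT)")
dbExecute(con, "INSERT INTO table_to_update VALUES (1, 'a'), (2, 'b'), (3, 'c')")

# Rows to upsert: id 2 already exists, id 4 is new (illustrative data)
new_rows <- data.frame(
  id               = c(2L, 4L),
  var_to_change    = c("B", "D"),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "staging_rows", new_rows, temporary = TRUE)

# Insert new ids; overwrite var_to_change where the id already exists
dbExecute(con, "
  INSERT INTO table_to_update (id, var_to_change)
  SELECT id, var_to_change FROM staging_rows WHERE true
  ON CONFLICT (id) DO UPDATE SET var_to_change = excluded.var_to_change
")

result <- dbGetQuery(con, "SELECT * FROM table_to_update ORDER BY id")
dbDisconnect(con)
```

Unlike the DELETE-then-INSERT route, this leaves untouched any columns of existing rows that are not listed in the SET clause.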