I am trying to use a parametrized query multiple times with package:ROracle
.
library(DBI)
conn <- dbConnect(ROracle::Oracle(), ...)
statement <- "select * from bigtable where name = :1"
name <- c("Bob", "Alice", "Carol")
I expected the query to be executed for each row of the data frame (i.e. as for to the param
argument in DBI::dbGetQuery()
), but it is not the case:
dbGetQuery(conn, statement, data.frame(name))
# Error in .oci.GetQuery(conn, statement, data = data, prefetch = prefetch, :
# bind data has too many rows
I can loop over the names, but I think it may be less efficient (I am new to RDBMS but I read about optimization/caching):
lapply(name, function(i) dbGetQuery(conn, statement, data.frame(i)))
How can it be done efficiently?
There are some related questions that does not solve the problem:
package:ROracle
.I think there are a couple of ways to approach this.
Insert (to a temp table), single-query on a join:
mynames <- data.frame(name = c("Bob", "Alice", "Carol"))
dbWriteTable(con, "mytemp", mynames)
alldat <- dbGetQuery(con, "
select bt.*
from bigtable bt
inner join mytemp my on bt.name=my.name")
This method has the advantage of being a single query. If you need it split by name later, you can always use R's split
.
(Apparently this doesn't work in ROracle
?) Create a stored-statement and repeatedly bind/fetch:
res <- dbSendStatement(con, "select * from bigtable where name = :1")
eachdat <- lapply(name, function(nm) { dbBind(res, list(nm)); dbFetch(res); })
dbClearResult(res)
I do not tend to use this method, though that's as much preference as anything else.
Use SELECT ... NAME IN (...)
statement <- paste0("select * from bigtable where name in (", paste(paste0(":", seq_along(name)), collapse = ","), ")")
alldat <- dbGetQuery(con, statement, list(name))
The paste(...)
in the statement produces ":1,:2,:3"
(counting along the length of name
), which should be compatible with your original query mechanism.
Having said that ... every server has its own optimization methods, but I suspect that with your lapply(.., \(z) dbGetQuery(..))
, your server will cache the optimized version since your query does not change between iterations. A DBA more familiar with Oracle might be able to clarify or refute my claim, but I have a hard time believing that it does not cache the query for some time (in a sql statement cache).