rdplyrgoogle-bigquerydbibigrquery

Error: BigQuery does not support temporary tables


I'm trying to join tables from two different datasets in the same project. How can I do this?

    library(tidyverse)
    library(bigrquery)
    
    con1 <- 
      bConnect(
        drv = bigrquery::bigquery(),
        project = PROJECT,
        dataset = "dataset_1"
      )
    con2 <- 
      bConnect(
        drv = bigrquery::bigquery(),
        project = PROJECT,
        dataset = "dataset_2"
      )
    
    A <- con1 %>% tbl("A")
    B <- con2 %>% tbl("B")
    
    inner_join(A, B,
              by = "key",
              copy = T) %>%
      collect()

Then I get the error: Error: BigQuery does not support temporary tables


Solution

  • The problem is most likely that you are using different connections to connect with the two tables. When you attempt this, R tries to copy data from one source into a temporary table on the other source.

    See this question and the copy parameter in this documentation (its a different package, but the principle is the same).

    The solution is to only use a single connection for all your tables. Something like this:

    con <- 
      bConnect(
        drv = bigrquery::bigquery(),
        project = PROJECT,
        dataset = "dataset_1"
      )
        
    A <- con %>% tbl("A")
    B <- con %>% tbl("B")
        
    inner_join(A, B,
               by = "key") %>%
      collect()
    

    You may need to leave the dataset parameter blank in your connection string, or use in_schema to include the dataset name along with the table when you connect to a remote table. It's hard to be sure without knowing more about the structure of your database(s).