rshinydate-rangedbplyrbigrquery

R Shiny app connected to BigQuery cannot filter by dates


I'm working on a R Shiny app that used to load the data from a CSV file and display it on a plot. The user had the ability to do some filtering via dropdowns and dateRangeInput selectors.

My goal is to scrap the CSV file and connect directly to BigQuery (which is where the original data table is located). Most importantly, I don't want to download the data from BigQuery into R's memory as it would give no improvement in terms of efficiency. To do so, I have had to do a lot of re-coding for the filtering, transforming code written in base R to dbplyr verbs.

It all seems to work until I get to filtring on the dates. Whenever I add the conditon on the dates, I get an error saying "Error in .transformer: value must be a string or scalar SQL, not the number 1."

This is a minimal example that gives me the error:

library(shiny)
library(dbplyr)
library(tidyverse)
library(ggplot2)
library(plotly)
library(bigrquery)

choices <- c(1, 2)

bq_conn <- dbConnect(
  bigrquery::bigquery(),
  project = "project_name", 
  dataset = "dataset_name",
  billing ="billing_name"
)

# read the data
df <- tbl(bq_conn, "10_C_app_input_data") %>%
    filter(atc_direction == 1)


ui <- fluidPage(
  
  sidebarLayout(
    sidebarPanel(
      width = 2,
      
      dateRangeInput(inputId = "dates",
                     label = "Period to view",
                     start = "2023-10-01",
                     end = "2024-09-30",
                     min = "2023-10-01",
                     max = "2024-09-30")
    ),
  
    mainPanel(
      plotlyOutput("testplot", 
                   height = "70vh", 
                   width = "80vw")
    )
  )
)

server <- function(input, output, session) {
  
  
  visual_filter <- reactive({ 
      df %>% 
        filter(
          & (datefull >= input$dates[1])
          & (datefull <= input$dates[2])       
      )
  })
  

  output$testplot <- renderPlotly({

    full_ggplot <- ggplot(visual_filter()) +
      geom_line(aes(x = datefull,
                    y = flow,
                    linetype = flow_type,
                    color = flow_type)) 
  })  
}

shinyApp(ui, server)


I have tried doing the filtering without a reactive and this seems to show the same behaviour - all works unless I try to filter by dates.

I should also add that I have tried converting both datefull and input$dates[i] to date using as.Date but this has had no effect.


Edit

As per comments, I have removed the direction selector as this works fine. I have also tried the following steps:

Note that I have also separately printed the content of input$dates into a htmlOutput on the app panel and I can confirm it's dates of the desired format yyyy-mm-dd.


Solution

  • Assuming that datefull >= "2023-10-01" produces filtering as expected (just not dynamic). Then as Ifeanyi Idiaye commented, this is most likely due to differences in data formats.


    The exact solution depends on the BigQuery data type. Note that you may not be able to judge this just from visual inspection: 2024-12-13 looks a lot like '2024-12-13' but one may be stored as date and the other stored as character.

    You can try something like the following to see how R interprets the data type, but this may differ from the BigQuery date type:

    df %>%
      select(datefull) %>%
      head() %>%
      collect() %>%
      pull() %>%
      class()
    

    If the datefull is stored in BigQuery as date, then the problem is with dbplyr translation. I would test this using:

    df %>% 
      filter(
        (datefull >= input$dates[1])
        & (datefull <= input$dates[2])       
      ) %>%
      show_query()
    

    In the worst case I would simplify filter down to create the minimum translation:

    visual_filter <- reactive({ 
      sdate = input$dates[1]
      edate = input$dates[2]
    
      out = filter(df, datefull >= sdate)
      out = filter(out, datefull <= edate)
    
      out
    })
    

    If the datefull is stored in BigQuery as character in YYYY-MM-DD format, then I would try: datefull >= as.character(input$dates[1]). This means the comparison is occurring in text format, based on alphabetic sorting. Alphabetic sorting of text in the format YYYY-MM-DD is equivalent to numeric sorting.

    If the datefull is stored in BigQuery as character in a different format, then you need to figure out a way to change its format. Perhaps something like: sql(CAST(datefull AS DATE)) >= input$dates[1].

    If the datefull is stored in BigQuery as numeric - it is probably 'days since a reference date'. You will need to find out what the reference date is. You then have the choice of converting input$dates to 'days since reference date' or converting the datefull to date.