I'm working on a R Shiny app that used to load the data from a CSV file and display it on a plot. The user had the ability to do some filtering via dropdowns and dateRangeInput selectors.
My goal is to scrap the CSV file and connect directly to BigQuery (which is where the original data table is located). Most importantly, I don't want to download the data from BigQuery into R's memory as it would give no improvement in terms of efficiency. To do so, I have had to do a lot of re-coding for the filtering, transforming code written in base R to dbplyr verbs.
It all seems to work until I get to filtring on the dates. Whenever I add the conditon on the dates, I get an error saying "Error in .transformer: value
must be a string or scalar SQL, not the number 1."
This is a minimal example that gives me the error:
library(shiny)
library(dbplyr)
library(tidyverse)
library(ggplot2)
library(plotly)
library(bigrquery)
choices <- c(1, 2)
bq_conn <- dbConnect(
bigrquery::bigquery(),
project = "project_name",
dataset = "dataset_name",
billing ="billing_name"
)
# read the data
df <- tbl(bq_conn, "10_C_app_input_data") %>%
filter(atc_direction == 1)
ui <- fluidPage(
sidebarLayout(
sidebarPanel(
width = 2,
dateRangeInput(inputId = "dates",
label = "Period to view",
start = "2023-10-01",
end = "2024-09-30",
min = "2023-10-01",
max = "2024-09-30")
),
mainPanel(
plotlyOutput("testplot",
height = "70vh",
width = "80vw")
)
)
)
server <- function(input, output, session) {
visual_filter <- reactive({
df %>%
filter(
& (datefull >= input$dates[1])
& (datefull <= input$dates[2])
)
})
output$testplot <- renderPlotly({
full_ggplot <- ggplot(visual_filter()) +
geom_line(aes(x = datefull,
y = flow,
linetype = flow_type,
color = flow_type))
})
}
shinyApp(ui, server)
I have tried doing the filtering without a reactive and this seems to show the same behaviour - all works unless I try to filter by dates.
I should also add that I have tried converting both datefull
and input$dates[i]
to date using as.Date
but this has had no effect.
Edit
As per comments, I have removed the direction selector as this works fine. I have also tried the following steps:
input$dates
with actual values in the form of strings (i.e. having something like datefull >= "2023-10-01"
) and this works fine but of course this is not the desired behaviourlubridate
converters to the filtering statement (i.e. like ymd(datefull) >= ymd(input$dates[1])
) but again this is throwing the dame error. I have also tried applying ymd
or lubridate::as_date
directly to the column datefull in a mutate
verb when I read the data but I get the same error.Note that I have also separately printed the content of input$dates
into a htmlOutput on the app panel and I can confirm it's dates of the desired format yyyy-mm-dd.
Assuming that datefull >= "2023-10-01"
produces filtering as expected (just not dynamic). Then as Ifeanyi Idiaye commented, this is most likely due to differences in data formats.
dateRangeInput
returns variables of type date.The exact solution depends on the BigQuery data type. Note that you may not be able to judge this just from visual inspection: 2024-12-13
looks a lot like '2024-12-13'
but one may be stored as date and the other stored as character.
You can try something like the following to see how R interprets the data type, but this may differ from the BigQuery date type:
df %>%
select(datefull) %>%
head() %>%
collect() %>%
pull() %>%
class()
If the datefull
is stored in BigQuery as date, then the problem is with dbplyr translation. I would test this using:
df %>%
filter(
(datefull >= input$dates[1])
& (datefull <= input$dates[2])
) %>%
show_query()
In the worst case I would simplify filter
down to create the minimum translation:
visual_filter <- reactive({
sdate = input$dates[1]
edate = input$dates[2]
out = filter(df, datefull >= sdate)
out = filter(out, datefull <= edate)
out
})
If the datefull
is stored in BigQuery as character in YYYY-MM-DD
format, then I would try: datefull >= as.character(input$dates[1])
. This means the comparison is occurring in text format, based on alphabetic sorting. Alphabetic sorting of text in the format YYYY-MM-DD
is equivalent to numeric sorting.
If the datefull
is stored in BigQuery as character in a different format, then you need to figure out a way to change its format. Perhaps something like: sql(CAST(datefull AS DATE)) >= input$dates[1]
.
If the datefull
is stored in BigQuery as numeric - it is probably 'days since a reference date'. You will need to find out what the reference date is. You then have the choice of converting input$dates
to 'days since reference date' or converting the datefull to date.