
R bigrquery - how to catch error messages from executed SQL?


Say I have some SQL code that refreshes a table of data, and I would like to schedule an R script that runs this code daily. Is there a way to capture any error message the SQL code throws and save it to an R variable, instead of having it printed to the R console?

For example, assume I have a stored procedure sp_causing_error() in BigQuery that takes data from a source table source_table and refreshes a target table table_to_refresh.

CREATE OR REPLACE PROCEDURE sp_causing_error()
BEGIN

CREATE OR REPLACE TABLE table_to_refresh AS (
   Select non_existent_column, x, y, z
   From source_table
);

END;

Assume the schema of the source_table has changed and column non_existent_column no longer exists. When attempting to call sp_causing_error() in RStudio via:

library(bigrquery)

query <- "CALL sp_causing_error()"

bq_project_query(my_project, query)

We get an error message printed to the console (which masks the actual error message we would encounter if running in BigQuery):

Error in UseMethod("as_bq_table") : no applicable method for 'as_bq_table' applied to an object of class "NULL"

If we run sp_causing_error() directly in BigQuery, it throws an error message stating:

Query error: Unrecognized name: non_existent_column at [sp_throw_error:3:8]

Are the query error messages displayed in BigQuery captured anywhere in bigrquery when executing SQL? My goal is to have a try/catch block in the R script that catches the error message, so it can be written to an output file if the SQL code did not run successfully. I am hoping there is a way to capture the descriptive error message from BigQuery and assign it to an R variable for further processing.

UPDATE

R's tryCatch() function comes in handy here to catch the R error message:

query <- "CALL sp_causing_error()"

result <- tryCatch(
  bq_project_query("research-01-217611", query),
  error = function(err) {
    return(err)
  }
)

result now contains the error message from the R console:

<simpleError in UseMethod("as_bq_table"): no applicable method for 'as_bq_table' applied to an object of class "NULL">

However, this is still not as descriptive as the error message quoted above, which we see when executing the same SQL code in BigQuery and which references an unrecognized column name. Can we catch that error message instead of the more generic R error message?
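For the logging part of the goal, the condition object returned by tryCatch() can be reduced to a plain string with conditionMessage() and appended to a file. The following is only a minimal sketch: capture_error_message() is a hypothetical helper name, the log file is a temporary file, and stop() stands in for the failing bigrquery call so the snippet runs without BigQuery credentials.

```r
# Hypothetical helper: evaluate an expression and return its error message
# as a character string, or NA_character_ if no error was raised.
capture_error_message <- function(expr) {
  tryCatch(
    {
      expr            # the lazily evaluated argument is forced here
      NA_character_   # reached only when no error occurred
    },
    error = function(err) conditionMessage(err)
  )
}

# Stand-in for the failing call, e.g. bq_project_query(my_project, query)
msg <- capture_error_message(stop("no applicable method for 'as_bq_table'"))

# Append a timestamped line to a log file (a temp file in this sketch)
log_file <- tempfile(fileext = ".log")
if (!is.na(msg)) {
  cat(format(Sys.time()), msg, "\n", file = log_file, append = TRUE)
}
```

This only captures whatever message R raises, so on its own it still records the generic as_bq_table error rather than the BigQuery one.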


Solution

  • UPDATE/ANSWER

    Wrapping the stored procedure call in BigQuery's BEGIN...EXCEPTION...END block, inside the query string sent from R, lets us get at the actual error message. Example code snippet:

    query <- '
    BEGIN
        CALL sp_causing_error();
    EXCEPTION WHEN ERROR THEN
        Select 1 AS error_flag, @@error.message AS error_message, @@error.statement_text AS error_statement_text, @@error.formatted_stack_trace AS stack_trace
        ;
    END;
    '
    
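    # my_project holds the billing project ID, as in the question
    query_result <- bq_table_download(bq_project_query(my_project, query))
    
    # error_flag is 1 when the EXCEPTION block ran
    error_flag <- query_result$error_flag[[1]]
    
    if (error_flag == 0) {
        print("Job ran successfully")
    } else {
        print("Job failed")
        # The BigQuery error text is now available as an R value
        error_message <- query_result$error_message[[1]]
        # Take additional action as desired, e.g. write error_message to a file
    }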
    

    Warning: This solution can raise an R error if the stored procedure completes successfully, because error_flag does not exist unless the stored procedure returns it explicitly. To work around this, add one line at the end of the stored procedure in BigQuery that sets the flag, so that bq_table_download() receives a value when the procedure runs successfully:

    BEGIN
    -- BigQuery stored procedure code
    -- ...
    -- ...
    Select 0 AS error_flag;
    END;
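The two pieces above (the BEGIN...EXCEPTION...END wrapper and the error_flag check) can be pulled into small reusable functions. This is a rough sketch, not part of bigrquery: wrap_with_exception_handler() and summarise_job() are hypothetical names, the data frame is a mock of what bq_table_download() is assumed to return, and treating a missing error_flag column as success is my own assumption.

```r
# Hypothetical helper: wrap a BigQuery statement in BEGIN...EXCEPTION...END
# so that a failure returns a one-row result describing the error.
wrap_with_exception_handler <- function(statement) {
  paste0(
    "BEGIN\n",
    "  ", statement, ";\n",
    "EXCEPTION WHEN ERROR THEN\n",
    "  SELECT 1 AS error_flag,\n",
    "         @@error.message AS error_message,\n",
    "         @@error.statement_text AS error_statement_text,\n",
    "         @@error.formatted_stack_trace AS stack_trace;\n",
    "END;"
  )
}

# Hypothetical helper: reduce the downloaded result to a status list.
# Assumption: a result without an error_flag column counts as success.
summarise_job <- function(query_result) {
  failed <- "error_flag" %in% names(query_result) &&
    nrow(query_result) > 0 &&
    query_result$error_flag[[1]] == 1
  list(
    failed = failed,
    message = if (failed) query_result$error_message[[1]] else NA_character_
  )
}

query <- wrap_with_exception_handler("CALL sp_causing_error()")

# Mock of the failure result; in real use this would come from
# bq_table_download(bq_project_query(my_project, query))
mock_result <- data.frame(
  error_flag = 1L,
  error_message = "Unrecognized name: non_existent_column",
  stringsAsFactors = FALSE
)

status <- summarise_job(mock_result)
log_file <- tempfile(fileext = ".log")
if (status$failed) {
  writeLines(status$message, log_file)  # write the BigQuery error to a file
}
```

In a scheduled script, the real bq_project_query() call could still be wrapped in tryCatch() as well, so that errors raised on the R side (authentication, network) are logged alongside the BigQuery ones.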