google-cloud-platformgoogle-bigquery

How to automatically back up and version BigQuery code such as stored procs?


What are some of the options to back up BigQuery DDLs - particularly views, stored procedure and function code?

We have a significant amount of code in BigQuery and we want to automatically back this up and preferably version it as well. Wondering how others are doing this.

Appreciate any help.

Thanks!


Solution

  • In order to keep and track our BigQuery structure and code, we're using Terraform to manage every resources in big query. More specifically to your question, We use google_bigquery_routine resource to make sure the changes are reviewed by other team members and every other benefit you get from working with VCS.

    Another important part of our TerraForm code is the fact we version our BigQuery module (via github releases/tags) that includes the Tables structure and Routines, version it and use it across multiple environments.

    Looks something like:

    main.tf

    module "bigquery" {
      source = "github.com/sample-org/terraform-modules.git?ref=0.0.2/bigquery"
    
      project_id       = var.project_id
    
    
      ...
      ... other vars for the module
      ... 
    }
    

    terraform-modules/bigquery/main.tf

    resource "google_bigquery_dataset" "test" {
        dataset_id = "dataset_id"
        project_id = var.project_name
    }
    
    resource "google_bigquery_routine" "sproc" {
      dataset_id = google_bigquery_dataset.test.dataset_id
      routine_id     = "routine_id"
      routine_type = "PROCEDURE"
      language = "SQL"
      definition_body = "CREATE FUNCTION Add(x FLOAT64, y FLOAT64) RETURNS FLOAT64 AS (x + y);"
    }
    

    This helps us upgrading our infrastructure across all environments without additional code changes