
BigQuery Policy Tags: possible as infra as code?


Like other cloud providers, Google Cloud Platform handles data access rights with a mish-mash of mechanisms.

GCP supports tags, IAM permissions and roles, etc. But it also supports BigQuery policy tags.

"Normal" tags can be managed via infra as code; Terraform, for instance, supports them.

But I cannot find any way to manage BigQuery policy tags via infra as code. Is this possible?

I don't need infra as code, per se, but I need the core guarantees that it provides for this use case:

How can I achieve this with BigQuery policy tags? Any examples or documentation would be greatly appreciated!


Solution

  • Here is a recipe that I use to create and apply policy tags to tables via Terraform:

    1. Create a taxonomy and a policy tag, and grant read access to the tag
    # https://cloud.google.com/iam/docs/understanding-roles#datacatalog.categoryFineGrainedReader
    
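    # IAM policy granting the Fine-Grained Reader role.
    # Note: "allAuthenticatedUsers" matches any authenticated Google account.
    data "google_iam_policy" "unrestricted_finegrained_reader" {
      binding {
        role = "roles/datacatalog.categoryFineGrainedReader"
        members = [
          "allAuthenticatedUsers",
        ]
      }
    }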
    
    resource "google_data_catalog_taxonomy" "basic_taxonomy" {
      display_name = "my_taxonomy"
      description  = "A collection of policy tags"
      region       = "us"
    }
    
    resource "google_data_catalog_policy_tag" "date_policy" {
      taxonomy     = google_data_catalog_taxonomy.basic_taxonomy.id
      display_name = "Date"
      description  = "<Add Description>"
    }
    
    resource "google_data_catalog_policy_tag_iam_policy" "policy" {
      policy_tag  = google_data_catalog_policy_tag.date_policy.name
      policy_data = data.google_iam_policy.unrestricted_finegrained_reader.policy_data
    }
    
    2. Create a dataset
    # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/bigquery_dataset
    resource "google_bigquery_dataset" "tf_dataset" {
      dataset_id    = "tf_dataset"
      description   = "Test dataset for Terraform"
      friendly_name = "tf_dataset"
      # Should match the taxonomy's region so the policy tag can be applied
      location = "us"
    }
    
    3. Create a table and attach the policy tag to a column
    # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/bigquery_table
    
    resource "google_bigquery_table" "table1" {
      dataset_id  = google_bigquery_dataset.tf_dataset.dataset_id
      table_id    = "table1"
      description = "Sample table"
      # "<<-EOF" lets the heredoc body and its closing marker be indented.
      # col1 carries the policy tag; col2 is left untagged.
      schema = <<-EOF
        [
          {
            "name": "col1",
            "type": "STRING",
            "mode": "NULLABLE",
            "description": "col1",
            "policyTags": {
              "names": [
                "${google_data_catalog_policy_tag.date_policy.id}"
              ]
            }
          },
          {
            "name": "col2",
            "type": "STRING",
            "mode": "NULLABLE",
            "description": "col2"
          }
        ]
      EOF
    }
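
A possible variation on the recipe above, offered as a sketch rather than a tested setup. It makes two assumptions that are not part of the original recipe: that you want the taxonomy to enforce column-level access control (via `activated_policy_types`), and that you would rather grant the reader role to one specific group than to all authenticated users. The resource names `enforced_taxonomy`, `pii`, `email`, and `analysts`, and the group address, are hypothetical placeholders.

```hcl
# Sketch only: names and the group address below are illustrative assumptions.

# Taxonomy with fine-grained access control switched on.
resource "google_data_catalog_taxonomy" "enforced_taxonomy" {
  display_name           = "my_enforced_taxonomy"
  description            = "Policy tags with column-level access control"
  region                 = "us"
  activated_policy_types = ["FINE_GRAINED_ACCESS_CONTROL"]
}

# Top-level policy tag.
resource "google_data_catalog_policy_tag" "pii" {
  taxonomy     = google_data_catalog_taxonomy.enforced_taxonomy.id
  display_name = "PII"
  description  = "Personally identifiable information"
}

# Child policy tag nested under "PII".
resource "google_data_catalog_policy_tag" "email" {
  taxonomy          = google_data_catalog_taxonomy.enforced_taxonomy.id
  display_name      = "Email"
  description       = "Email addresses"
  parent_policy_tag = google_data_catalog_policy_tag.pii.id
}

# Non-authoritative grant: adds one member to the role without
# replacing other bindings already present on the policy tag.
resource "google_data_catalog_policy_tag_iam_member" "analysts" {
  policy_tag = google_data_catalog_policy_tag.pii.name
  role       = "roles/datacatalog.categoryFineGrainedReader"
  member     = "group:data-analysts@example.com"
}
```

The `_iam_member` form is used here because, unlike `_iam_policy`, it does not overwrite the tag's whole IAM policy, which tends to be safer when other tooling also manages access. The two forms should not be combined on the same policy tag.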