terraform, terraform-provider-databricks

Terraform loop over Map variable to provision multiple Databricks catalogs


I'm following this example to provision multiple Databricks catalogs (see provider method) using for_each, but Terraform doesn't detect any differences from the state. The provider resource itself definitely works, so the issue seems to be the HCL syntax of my loop, which isn't picking anything up.

variables.tf (where the configuration is declared):

variable "catalog" {
  type = map(object({
    catalog_grants         = optional(map(list(string)))
    catalog_owner          = optional(string)         # Username/groupname/sp application_id of the catalog owner.
    catalog_storage_root   = optional(string)         # Location in cloud storage where data for managed tables will be stored
    catalog_isolation_mode = optional(string, "OPEN") # Whether the catalog is accessible from all workspaces or a specific set of workspaces. Can be ISOLATED or OPEN.
    catalog_comment        = optional(string)         # User-supplied free-form text
    catalog_properties     = optional(map(string))    # Extensible Catalog Tags.
    schema_name            = optional(list(string))   # List of Schema names relative to parent catalog.
    schema_grants          = optional(map(list(string)))
    schema_owner           = optional(string) # Username/groupname/sp application_id of the schema owner.
    schema_comment         = optional(string)
    schema_properties      = optional(map(string))
  }))
  description = "Map of catalog name and its parameters"
  default = {
    catalog = {
      example_catalog1 = {
        catalog_grants = {
          "example@username.com" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
        }
        schema_name = ["raw", "refined", "data_product"]
      }
      example_catalog2 = {
        catalog_grants = {
          "example@username.com" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
        }
        schema_name = ["raw", "refined", "data_product"]
      }
    }
  }
}

main.tf (where the loop is performed):

resource "databricks_catalog" "this" {
  for_each = var.catalog

  name           = each.key
  owner          = each.value.catalog_owner
  storage_root   = each.value.catalog_storage_root
  isolation_mode = each.value.catalog_isolation_mode
  comment        = lookup(each.value, "catalog_comment", "default comment")
  properties     = lookup(each.value, "catalog_properties", {})
  force_destroy  = true
}

The only required argument for the databricks_catalog resource is name, which is taken from each key of the map above.

On the other hand, provisioning a catalog without the loop works:

resource "databricks_catalog" "sandbox" {
  name    = "example_catalog3"
}

Solution

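  • The default value in your variable block has the wrong structure. The declared type is map(object({...})), keyed by catalog name, but your default wraps the catalogs in an extra top-level map named catalog, so it is a map of a map of catalogs. The only top-level key is therefore catalog, not example_catalog1 or example_catalog2. To solve the issue, remove that top-level map, which is unnecessary for your use case. After removing it, your variable looks like this: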

    variable "catalog" {
      type = map(object({
        catalog_grants         = optional(map(list(string)))
        catalog_owner          = optional(string)         # Username/groupname/sp application_id of the catalog owner.
        catalog_storage_root   = optional(string)         # Location in cloud storage where data for managed tables will be stored
        catalog_isolation_mode = optional(string, "OPEN") # Whether the catalog is accessible from all workspaces or a specific set of workspaces. Can be ISOLATED or OPEN.
        catalog_comment        = optional(string)         # User-supplied free-form text
        catalog_properties     = optional(map(string))    # Extensible Catalog Tags.
        schema_name            = optional(list(string))   # List of Schema names relative to parent catalog.
        schema_grants          = optional(map(list(string)))
        schema_owner           = optional(string) # Username/groupname/sp application_id of the schema owner.
        schema_comment         = optional(string)
        schema_properties      = optional(map(string))
      }))
      description = "Map of catalog name and its parameters"
      default = {
        # The top-level "catalog" map that used to wrap these entries has been removed
        example_catalog1 = {
          catalog_grants = {
            "example@username.com" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
          }
          schema_name = ["raw", "refined", "data_product"]
        }
        example_catalog2 = {
          catalog_grants = {
            "example@username.com" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
          }
          schema_name = ["raw", "refined", "data_product"]
        }
      }
    }
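
    A quick way to confirm what for_each will iterate over is to expose the keys of the variable. This is only a sketch for debugging: the output name catalog_keys is hypothetical and not part of the original configuration.

    ```hcl
    # Hypothetical debugging output (not in the original configuration):
    # lists the top-level keys of var.catalog, i.e. the values each.key takes.
    output "catalog_keys" {
      value = keys(var.catalog)
    }
    ```

    With the corrected default above, the planned output should list example_catalog1 and example_catalog2, which are the names the databricks_catalog resource receives through each.key.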
    

    To validate, I ran terraform plan and got 2 resources, as expected:

    Terraform will perform the following actions:
    
      # databricks_catalog.this["example_catalog1"] will be created
      + resource "databricks_catalog" "this" {
          + force_destroy  = true
          + id             = (known after apply)
          + isolation_mode = "OPEN"
          + metastore_id   = (known after apply)
          + name           = "example_catalog1"
          + owner          = (known after apply)
        }
    
      # databricks_catalog.this["example_catalog2"] will be created
      + resource "databricks_catalog" "this" {
          + force_destroy  = true
          + id             = (known after apply)
          + isolation_mode = "OPEN"
          + metastore_id   = (known after apply)
          + name           = "example_catalog2"
          + owner          = (known after apply)
        }
    
    Plan: 2 to add, 0 to change, 0 to destroy.
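
    Note that the plan shows no comment, even though main.tf passes "default comment" as a fallback to lookup. A likely reason is that attributes declared with optional() are always present on the object and are null when unset, so lookup finds the attribute and returns null rather than the fallback. If a fallback is wanted, one option is coalesce. The following is a sketch under that assumption and has not been run against the provider:

    ```hcl
    resource "databricks_catalog" "this" {
      for_each = var.catalog

      name           = each.key
      owner          = each.value.catalog_owner
      storage_root   = each.value.catalog_storage_root
      isolation_mode = each.value.catalog_isolation_mode
      # coalesce returns its first argument that is neither null nor an empty string,
      # so the fallback applies when catalog_comment is left unset.
      comment = coalesce(each.value.catalog_comment, "default comment")
      # Optional attributes are null when unset; passing null leaves the argument unset.
      properties    = each.value.catalog_properties
      force_destroy = true
    }
    ```

    Alternatively, the default can live in the type itself, for example optional(string, "default comment"), in the same way catalog_isolation_mode already defaults to "OPEN".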