I have multiple aws_glue_catalog_table
resources and I want to create a single output
that loops over all resources to show the S3 bucket location of each one. The purpose of this is to test if I am using the correct location
(because it is a concatenation of variables) for each resource in Terratest. I cannot use aws_glue_catalog_table.*
or aws_glue_catalog_table.[]
because Terraform does not allow to reference a resource without specifying its name.
So I created a variable "table_names"
with r1
, r2
, rx
. Then, I can loop over the names. I want to create the string aws_glue_catalog_table.r1.storage_descriptor[0].location
dynamically, so I can check if the location
is correct.
resource "aws_glue_catalog_table" "r1" {
name = "r1"
database_name = var.db_name
storage_descriptor {
location = "s3://${var.bucket_name}/${var.environment}-config/r1"
}
...
}
resource "aws_glue_catalog_table" "rX" {
name = "rX"
database_name = var.db_name
storage_descriptor {
location = "s3://${var.bucket_name}/${var.environment}-config/rX"
}
}
variable "table_names" {
description = "The list of Athena table names"
type = list(string)
default = ["r1", "r2", "r3", "rx"]
}
output "athena_tables" {
description = "Athena tables"
value = [for n in var.table_names : n]
}
First attempt: I tried to create an output "athena_tables_location"
with the syntax aws_glue_catalog_table.${table}
but does does.
output "athena_tables_location" {
// HOW DO I ITERATE OVER ALL TABLES?
value = [for t in var.table_names : aws_glue_catalog_table.${t}.storage_descriptor[0].location"]
}
Second attempt: I tried to create a variable "table_name_locations"
but IntelliJ already shows an error ${t}
in the for loop [for t in var.table_names : "aws_glue_catalog_table.${t}.storage_descriptor[0].location"]
.
variable "table_name_locations" {
description = "The list of Athena table locations"
type = list(string)
// THIS ALSO DOES NOT WORK
default = [for t in var.table_names : "aws_glue_catalog_table.${t}.storage_descriptor[0].location"]
}
How can I list all table locations in the output
and then test it with Terratest?
Once I can iterate over the tables and collect the S3 location I can do the following test using Terratest:
athenaTablesLocation := terraform.Output(t, terraformOpts, "athena_tables_location")
assert.Contains(t, athenaTablesLocation, "s3://rX/test-config/rX",)
It seems like you have an unusual mix of static and dynamic here: you've statically defined a fixed number of aws_glue_catalog_table
resources but you want to use them dynamically based on the value of an input variable.
Terraform doesn't allow dynamic references to resources because its execution model requires building a dependency graph between all of the objects, and so it needs to know which exact resources are involved in a particular expression. However, you can in principle build your own single value that includes all of these objects and then dynamically choose from it:
locals {
tables = {
r1 = aws_glue_catalog_table.r1
r2 = aws_glue_catalog_table.r2
r3 = aws_glue_catalog_table.r3
# etc
}
}
output "table_locations" {
value = {
for t in var.table_names : t => local.tables[t].storage_descriptor[0].location
}
}
With this structure Terraform can see that output "table_locations"
depends on local.tables
and local.tables
depends on all of the relevant resources, and so the evaluation order will be correct.
However, it also seems like your table definitions are systematic based on var.table_names
and so could potentially benefit from being dynamic themselves. You could achieve that using the resource for_each
feature to declare multiple instances of a single resource:
variable "table_names" {
description = "Athena table names to create"
type = set(string)
default = ["r1", "r2", "r3", "rx"]
}
resource "aws_glue_catalog_table" "all" {
for_each = var.table_names
name = each.key
database_name = var.db_name
storage_descriptor {
location = "s3://${var.bucket_name}/${var.environment}-config/${each.key}"
}
...
}
output "table_locations" {
value = {
for k, t in aws_glue_catalog_table.all : k => t.storage_descriptor[0].location
}
}
In this case aws_glue_catalog_table.all
represents all of the tables together as a single resource with multiple instances, each one identified by the table name. for_each
resources appear in expressions as maps, so this will declare resource instances with addresses like this:
aws_glue_catalog_table.all["r1"]
aws_glue_catalog_table.all["r2"]
aws_glue_catalog_table.all["r3"]
Because this is already a map, this time we don't need the extra step of constructing the map in a local value, and can instead just access this map directly to build the output value, which will be a map from table name to storage location:
{
r1 = "s3://BUCKETNAME/ENVNAME-config/r1"
r2 = "s3://BUCKETNAME/ENVNAME-config/r2"
r3 = "s3://BUCKETNAME/ENVNAME-config/r3"
# ...
}
In this example I've assumed that all of the tables are identical aside from their names, which I expect isn't true in practice but I was going only by what you included in the question. If the tables do need to have different settings then you can change var.table_names
to instead be a variable "tables"
whose type is a map of object type where the values describe the differences between the tables, but that's a different topic kinda beyond the scope of this question, so I won't get into the details of that here.