I've seen this question asked elsewhere but I don't think it quite addresses the issue I'm having.
We're using a module to create an assumable role:
module "service_task_role" {
  source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role"
  version = "5.32.0"
  create_custom_role_trust_policy = true
  create_role = true
  role_name = "${local.full_name}-task-role"
  role_requires_mfa = false
  custom_role_trust_policy = data.aws_iam_policy_document.task_trust_policy_document.json
  custom_role_policy_arns = concat(
    [var.task_policy_arn],
    [for rule in module.rds_security_group_rules : rule.policy_arn]
  )
  number_of_custom_role_policy_arns = 1 + (var.database_connection == null ? 0 : 1)
  tags = module.service_iam_label.tags
}
This worked fine until we realized that occasionally we needed a role created without a task_policy_arn. I modified this code to look like this:
locals {
  has_task_policy = var.task_policy_arn != null
  num_custom_policies = (local.has_task_policy ? 1 : 0) + (var.database_connection == null ? 0 : 1)
}
module "service_task_role" {
  source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role"
  version = "5.32.0"
  create_custom_role_trust_policy = true
  create_role = true
  role_name = "${local.full_name}-task-role"
  role_requires_mfa = false
  custom_role_trust_policy = data.aws_iam_policy_document.task_trust_policy_document.json
  custom_role_policy_arns = concat(
    local.has_task_policy ? [var.task_policy_arn] : [],
    [for rule in module.rds_security_group_rules : rule.policy_arn]
  )
  number_of_custom_role_policy_arns = local.num_custom_policies
  tags = module.service_iam_label.tags
}
and now the plan fails. Looking out a bit wider, this code is part of a module which creates an ECS task. We typically call this module like so:
module "service_policy" {
  source = "terraform-aws-modules/iam/aws//modules/iam-policy"
  version = "5.32.0"
  count = local.create_service_policy ? 1 : 0
  name = "service-${var.application}-policy"
  path = "/"
  description = "Policy for the ${title(var.application)} service"
  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : concat(var.parameter_store_access || local.has_local_secrets ? [
      {
        "Action" : [
          "ssm:GetParameter",
          "ssm:GetParameters",
          "ssm:GetParametersByPath",
        ],
        "Effect" : "Allow",
        "Resource" : [
          "arn:aws:ssm:${var.region}:${local.aws_account_id}:parameter/*",
        ]
      }
      ] : [],
      local.secrets_manager_config,
      var.custom_policies,
    )
  })
  tags = module.service_label.tags
}
module "shared_ecs" {
  source = "app.terraform.io/My-Org/aws//modules/ecs"
  version = "2.0.7"
  # Define the core environment variables
  region = var.region
  environment = var.environment
  # Define the details associated with the load balancer and networking
  listener_priority = var.priority
  service_port = var.port
  slow_start = var.slow_start
  short_name = var.short_name
  health_check = local.health_check_endpoint
  # Define the scopes for the API
  scopes = var.scopes
  # Define the details associated with the ECS task
  task_policy_arn = local.create_service_policy ? module.service_policy[0].arn : null
  desired_count = local.instance_count
  cpu = var.cpu
  memory = var.memory
  # Define the RDS instance details if provided
  database_connection = var.database_connection
}
In this code, we should know the condition of module.service_task_role[0] once the Terraform plan completes, and local.create_service_policy only depends on variable inputs. Since this is the top-level module, those are all constant. Thus, I don't think I should be seeing an error, but when I try to do a plan I get the following:
╷
│ Error: Invalid count argument
│
│ on .terraform/modules/service.shared_ecs.service_task_role/modules/iam-assumable-role/main.tf line 165, in resource "aws_iam_role_policy_attachment" "custom":
│ 165: count = var.create_role ? coalesce(var.number_of_custom_role_policy_arns, length(var.custom_role_policy_arns)) : 0
│
│ The "count" value depends on resource attributes that cannot be determined
│ until apply, so Terraform cannot predict how many instances will be
│ created. To work around this, use the -target argument to first apply only
│ the resources that the count depends on.
╵
I thought I could create two instances of service_task_role, one including var.task_policy_arn and one without it, but that would also involve the use of count, so it seems like that would result in the same issue.
EDIT: I was able to fix the error by adding a moved block for all the existing callers of this module:
moved {
  from = module.service["some_service"].module.service_policy.aws_iam_policy.policy[0]
  to = module.service["some_service"].module.service_policy[0].aws_iam_policy.policy[0]
}
Now I'm even more confused.
Does anyone know why this is happening and what I can do to resolve the issue?
The main essence of the change you made is that the number of elements in custom_role_policy_arns is now decided based on whether var.task_policy_arn is null. This means that Terraform can only determine the length of that collection once the "nullness" of the task policy ARN has been decided.
Unfortunately, Terraform's support for tracking whether a not-yet-decided value could potentially be null is relatively recent (Terraform v1.6 in late 2023) and so large providers like hashicorp/aws have not been fully updated to be able to tell Terraform whether their exported values are nullable or not.
The most likely explanation for the behavior you encountered, then, is that var.task_policy_arn is derived from the arn attribute of some resource instance in the service_policy module that hasn't been created yet, and so the provider is reporting that this value (and its nullness) won't be decided until the apply phase, once the remote object has been created.
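The situation can be reproduced in isolation. The following is a minimal sketch, not the original configuration: it assumes the hashicorp/random provider purely as a stand-in for the IAM policy, and the resource and local names are hypothetical. Because the id attribute is unknown until apply, the null check on it is unknown as well, so the first plan would be expected to fail with the same "Invalid count argument" error unless the provider reports the attribute as never null.

```hcl
terraform {
  required_version = ">= 1.6.0"
  required_providers {
    random = {
      source = "hashicorp/random"
    }
  }
}

# Hypothetical stand-in for the IAM policy: its id is not known until apply.
resource "random_pet" "policy" {}

locals {
  # Unknown during the first plan...
  task_policy_arn = random_pet.policy.id

  # ...so this comparison is unknown too, unless the provider
  # marks the attribute as never null.
  has_task_policy = local.task_policy_arn != null
}

# count depends on an unknown boolean, so planning is expected to fail here.
resource "terraform_data" "attachment" {
  count = local.has_task_policy ? 1 : 0
}
```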
If you know that in practice this ARN value cannot possibly be null (which is true for most arn attributes in the hashicorp/aws provider), then until the provider is updated to report this properly itself, you can work around it by giving Terraform some more information to help it better understand the situation.
For example, in the module "shared_ecs" block you could write the definition of task_policy_arn like this:
task_policy_arn = (
  local.create_service_policy ?
  coalesce(module.service_policy[0].arn) :
  null
)
By definition the coalesce function can never produce a null value -- it returns an error if all of the given arguments are null -- and so Terraform automatically infers that any value derived from the result of that function cannot be null even if the value is otherwise unknown.
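To see that inference on its own, here is a small self-contained sketch. It assumes Terraform v1.6 or later and uses the hashicorp/random provider only as an illustrative source of a value that is unknown during planning; the resource and local names are hypothetical and not part of the original configuration. Wrapping the unknown value in coalesce should let the null check resolve to true at plan time, so the count is known.

```hcl
terraform {
  required_version = ">= 1.6.0"
  required_providers {
    random = {
      source = "hashicorp/random"
    }
  }
}

# Hypothetical stand-in for a resource whose id is unknown until apply.
resource "random_pet" "policy" {}

locals {
  # coalesce never returns null, so the result is still unknown
  # but is known to be non-null.
  task_policy_arn = coalesce(random_pet.policy.id)

  # Comparing a known-non-null value with null yields a known "true".
  has_task_policy = local.task_policy_arn != null
}

# count now depends only on a known boolean, so planning can proceed.
resource "terraform_data" "attachment" {
  count = local.has_task_policy ? 1 : 0
}
```

The trade-off is that coalesce turns an unexpectedly null value into an apply-time error, which is only appropriate when the value genuinely can never be null.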
This means that inside the shared_ecs module Terraform can assume that var.task_policy_arn is not null, meaning that:
- var.task_policy_arn != null can return true instead of an unknown value, so...
- local.has_task_policy will have the known value true, and so...
- the first argument to concat in the custom_role_policy_arns definition in module "service_task_role" definitely has exactly one element, and so...
- length(var.custom_role_policy_arns) in the count argument can return 1, instead of an unknown number, which should therefore avoid this error message.

There's some more background context on the current situation in hashicorp/terraform-plugin-framework#869, in case that's interesting. That issue discusses implementation details behind the problem rather than the surface-level problem itself, but I'm linking to it here because I think that's the most likely place that any progress on this problem would be reported, so if someone finds this question in the far future they can learn whether my answer is still valid or the situation has changed in the meantime.