I have set up a CloudWatch metric filter to watch a Lambda log group:
resource "aws_cloudwatch_log_metric_filter" "log_errors" {
  name           = "${local.fullname}-log-errors"
  log_group_name = "/aws/lambda/${local.fullname}"
  pattern        = "{ $._logLevel = \"error\" }"

  metric_transformation {
    name      = "${local.fullname}-error-count"
    namespace = "MyApp"
    value     = "1"
  }
}
I can see the metric is working - note the dot at 13:15 below (me manually creating a log entry to test):
And an alarm to fire if the metric reports 1 or more events within a minute:
resource "aws_cloudwatch_metric_alarm" "log_errors_alarm" {
  alarm_name          = "${local.fullname}-log-errors"
  alarm_description   = "log.error() count for MyApp lambda ${local.fullname}"
  metric_name         = "${local.fullname}-error-count"
  threshold           = "0"
  statistic           = "Sum"
  unit                = "Count"
  comparison_operator = "GreaterThanThreshold"
  datapoints_to_alarm = "1"
  evaluation_periods  = "1"
  period              = "60"
  namespace           = "MyApp"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [data.aws_ssm_parameter.sns_topic_arn.value]
  ok_actions          = [data.aws_ssm_parameter.sns_topic_arn.value]
}
But despite the metric having an event (per above) the alarm is never fired:
I'm unsure how to debug this: all the AWS resources are created successfully, errors that I create manually reach the metric, and a very similar alarm config works in other lambdas, where it does fire alarms.
Why is my metric working but my alarm not alarming?
I'd put my money on the unit being inconsistent between the metric_alarm and the metric_filter.
You're setting the unit on the metric_alarm to Count, but you're not setting a unit on the metric_filter's metric_transformation, so the metric_transformation will default to None.
Try setting the unit in the alarm to None or removing unit altogether.
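As a rough sketch of the second suggestion, here is the alarm from the question with the unit argument removed and everything else unchanged. The reasoning, as far as I understand CloudWatch's behaviour, is that an alarm with a unit set only evaluates datapoints published with that same unit, so datapoints recorded with unit None are treated as missing, and with treat_missing_data = "notBreaching" the alarm simply stays in OK.

```hcl
# Sketch: same alarm as in the question, minus the `unit` argument.
# Without `unit`, the alarm is not restricted to datapoints of one unit,
# so it should pick up the filter's datapoints (which default to None).
resource "aws_cloudwatch_metric_alarm" "log_errors_alarm" {
  alarm_name          = "${local.fullname}-log-errors"
  alarm_description   = "log.error() count for MyApp lambda ${local.fullname}"
  metric_name         = "${local.fullname}-error-count"
  namespace           = "MyApp"
  statistic           = "Sum"
  comparison_operator = "GreaterThanThreshold"
  threshold           = "0"
  period              = "60"
  evaluation_periods  = "1"
  datapoints_to_alarm = "1"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [data.aws_ssm_parameter.sns_topic_arn.value]
  ok_actions          = [data.aws_ssm_parameter.sns_topic_arn.value]
}
```

If you would rather keep unit = "Count" on the alarm, the alternative is to add unit = "Count" inside the filter's metric_transformation block, assuming your AWS provider version supports that argument. In that case I would expect only datapoints published after the change to carry the new unit, so test with a fresh log entry.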