I'm creating new CloudWatch alarms, and while monitoring them some go into ALARM for missing data or for data breaching the threshold, but they never return to OK even though the metric data looks fine. We have one alarm that went into the ALARM state because of missing data.
The alarm is configured to trigger when 5 out of 5 data points are below a threshold of 3, with a 5-minute period. The graph shows that the data is being emitted, and each of the last 7 or so data points has been above the threshold of 3. Does anyone have any ideas why the alarm hasn't reverted back to the OK state yet?
I'm tracking Allowed Requests from a WebACL, looking at the average over each 5-minute period. I can see the metric on the graph, and all the data points look good.
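For context, here is a trimmed-down sketch of roughly how the alarm is defined in the CloudFormation template (the namespace, dimension values, and resource names below are illustrative placeholders, not the exact values from our stack):

```yaml
AllowedRequestsAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: allowed-requests-low
    Namespace: AWS/WAFV2
    MetricName: AllowedRequests
    Statistic: Average
    Period: 300                  # 5-minute period
    EvaluationPeriods: 5
    DatapointsToAlarm: 5         # alarm on 5 out of 5 data points
    ComparisonOperator: LessThanThreshold
    Threshold: 3
    Unit: Count
    Dimensions:
      - Name: WebACL
        Value: my-web-acl        # placeholder
      - Name: Rule
        Value: ALL
      - Name: Region
        Value: us-east-1         # placeholder
```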
After looking closely, it seems the Unit: Count property on the alarm was causing it to get stuck in the ALARM state. Once I removed that property (the alarms are created through a CloudFormation template), the alarms went back to OK.
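For reference, the change amounted to dropping the Unit property from the alarm resource (relative to the sketch in the question). Per the CloudWatch documentation, when Unit is set the alarm only evaluates data points reported with that exact unit, so a unit mismatch can leave the alarm stuck even though the metric looks healthy on the graph:

```yaml
    ComparisonOperator: LessThanThreshold
    Threshold: 3
    # Unit: Count   <- removed; with Unit specified, the alarm only matches
    #                  data points published with that exact unit, so if the
    #                  service emits the metric with a different (or no) unit
    #                  the alarm sees no data and never transitions back to OK
```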