amazon-web-services aws-elb

Monitoring unhealthy targets of Application Load Balancer


I'm trying to build a solution that monitors the health checks of Application Load Balancer targets and creates an OpsGenie alert when one fails. I followed the AWS documentation, but I noticed that when a second target's health check fails while the first target is still unhealthy, the Lambda is not called again, because the alarm is still in the "In alarm" state.
What could I change so that the Lambda is called every time another target fails?

I have tried adding a separate alarm for every target, but that is not a suitable solution for me: I have a lot of targets in the Application Load Balancer, and every time I add a new one I would need to create an alarm for it.


Solution

  • Here is my suggestion:

    1. Set the target group's health check interval to 2 minutes
    2. Use the UnHealthyHostCount metric to trigger the CloudWatch alarm
    3. Now set three things
      • Set the CloudWatch alarm period to 10 seconds
      • Trigger the alarm when there are any unhealthy targets
      • Under Additional configuration -> Missing data treatment in the CloudWatch alarm, choose to treat missing data as good
    4. Add the SNS notification for the In alarm state
    5. Finally, create the alarm
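
    The steps above can also be sketched in code instead of the console. This is only a rough sketch: the alarm name, load balancer and target group dimension values, and the SNS topic ARN are made-up placeholders, and the statistic (Maximum) and threshold (greater than 0) are my assumptions for "there are unhealthy targets". The function only builds the parameters, so it runs without AWS credentials.

```python
def build_unhealthy_host_alarm(alarm_name, load_balancer, target_group,
                               sns_topic_arn, period_seconds=10):
    """Return keyword arguments for CloudWatch's PutMetricAlarm call."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "UnHealthyHostCount",
        "Dimensions": [
            {"Name": "LoadBalancer", "Value": load_balancer},
            {"Name": "TargetGroup", "Value": target_group},
        ],
        # Assumption: alarm as soon as at least one target is unhealthy.
        "Statistic": "Maximum",
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Step 3: short alarm period, one evaluation period.
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        # Step 3: "treat missing data as good" is "notBreaching" in the API.
        "TreatMissingData": "notBreaching",
        # Step 4: notify the SNS topic when the alarm enters the ALARM state.
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical placeholder values, replace with your own resources.
params = build_unhealthy_host_alarm(
    alarm_name="alb-unhealthy-targets",
    load_balancer="app/my-alb/0123456789abcdef",
    target_group="targetgroup/my-targets/0123456789abcdef",
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:alb-alerts",
)
```

    With boto3 installed and credentials configured, the result could be passed on as `boto3.client("cloudwatch").put_metric_alarm(**params)`.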

    Here is how it works:

    The health check status is only reported every 2 minutes, but the alarm evaluates the metric every 10 seconds or so. Because there is no data in between, the alarm sees those periods as missing data, and since missing data is treated as good, the alarm goes back to OK (green). After 2 minutes, the alarm goes off again if a target is unhealthy again, and each new change into the In alarm state sends the SNS notification again.
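
    To make the timing easier to see, here is a toy simulation of that behaviour. It is a simplified model I made up for illustration, not CloudWatch's real evaluation logic: it assumes one datapoint per health check (every 120 seconds), one alarm evaluation per 10-second window, and that a window without data counts as OK.

```python
def alarm_states(datapoints, alarm_period_s, duration_s):
    """Return the alarm state for each evaluation window.

    datapoints maps a timestamp in seconds to the unhealthy host count
    reported at that moment.
    """
    states = []
    for start in range(0, duration_s, alarm_period_s):
        window = [count for t, count in datapoints.items()
                  if start <= t < start + alarm_period_s]
        if not window:
            states.append("OK")      # missing data is treated as good
        elif max(window) > 0:
            states.append("ALARM")   # at least one unhealthy target
        else:
            states.append("OK")
    return states


def count_alarm_transitions(states):
    """Count how often the state changes from OK to ALARM."""
    transitions = 0
    previous = "OK"
    for state in states:
        if state == "ALARM" and previous != "ALARM":
            transitions += 1
        previous = state
    return transitions


# One target is unhealthy at 0 s and 120 s, a second one fails by 240 s.
datapoints = {0: 1, 120: 1, 240: 2}
states = alarm_states(datapoints, alarm_period_s=10, duration_s=360)
transitions = count_alarm_transitions(states)
print(transitions)  # prints 3
```

    In this model the alarm enters ALARM three times in six minutes, once per health check that reports an unhealthy target, so the notification is sent again even though the first target was never fixed.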