google-cloud-platform google-cloud-monitoring google-cloud-load-balancer google-cloud-error-reporting

Getting alerts for a TCP LB whenever there's an unhealthy target?


I am currently using a TCP load balancer with an unmanaged instance group of 3 target VMs. To respond quickly, I need an alert whenever fewer than 3 of the 3 VMs are healthy.

Is there a way to get these alerts through email, Slack, or PagerDuty in GCP?


Solution

  • It's possible to create an alert that notifies you when one of the instances in your group stops working properly.

    Go to your unmanaged instance group's details page and switch to the "Monitoring" tab.

    Click "Create alerting policy" and another panel will open.

    At the bottom of this panel, change the Condition to "is below" and the Threshold to 3.

    You will then land on the policy creation page.

    Click Next and select the desired notification channel. If none are available, click "Manage notification channels" and create the one you want; options include email, SMS, Slack, and many others.
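The console steps above can also be written down as an alert policy definition. The following is a minimal sketch, not the exact policy the console generates: it only builds a request body for the Cloud Monitoring `AlertPolicy` resource. The function name, the 60-second windows, and the choice of the `compute.googleapis.com/instance_group/size` metric are assumptions made for illustration; check which metric the console actually selected for your group before relying on it.

```python
import json


def build_group_size_policy(group_name, expected_size, channels):
    """Build an AlertPolicy body that fires when the group size drops below expected_size.

    Assumption: the instance group size metric is the one being watched.
    `channels` is a list of notification channel resource names.
    """
    metric_filter = (
        'metric.type="compute.googleapis.com/instance_group/size" '
        'AND resource.type="instance_group" '
        f'AND resource.labels.instance_group_name="{group_name}"'
    )
    return {
        "displayName": f"{group_name} has fewer than {expected_size} instances",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": f"Instance group size below {expected_size}",
                "conditionThreshold": {
                    "filter": metric_filter,
                    # "is below" in the console corresponds to COMPARISON_LT.
                    "comparison": "COMPARISON_LT",
                    "thresholdValue": expected_size,
                    # Illustrative window lengths, not values from the answer.
                    "duration": "60s",
                    "aggregations": [
                        {"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MEAN"}
                    ],
                },
            }
        ],
        "notificationChannels": channels,
    }


if __name__ == "__main__":
    # "hc-group-1" is the group name used in the log query of this answer.
    print(json.dumps(build_group_size_policy("hc-group-1", 3, []), indent=2))
```

The printed JSON could then be saved to a file and submitted through the Monitoring API's alert policy creation method, with the notification channel names filled in.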




    Another approach is to create an alert triggered by logs.

    First, create a health check with logging enabled. Then go to your load balancer's settings, edit the backend service, and select the health check you created.
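For reference, a health check with logging enabled might look roughly like the request body below. This is a sketch under assumptions: it only builds a dictionary shaped like the Compute Engine `healthChecks` resource, and the helper name, the check name, and the interval and threshold values are hypothetical. The key part is `logConfig.enable`, which is what makes probe results appear in the logs.

```python
def build_health_check(name, port=80, protocol="HTTP"):
    """Build a Compute Engine health check body with probe logging enabled.

    Interval, timeout, and threshold values are illustrative assumptions.
    """
    protocol = protocol.upper()
    if protocol not in ("HTTP", "TCP"):
        raise ValueError(f"unsupported protocol in this sketch: {protocol}")

    body = {
        "name": name,
        "type": protocol,
        "checkIntervalSec": 5,
        "timeoutSec": 5,
        "healthyThreshold": 2,
        "unhealthyThreshold": 2,
        # Without this, health state changes are not written to Cloud Logging.
        "logConfig": {"enable": True},
    }
    if protocol == "HTTP":
        body["httpHealthCheck"] = {"port": port, "requestPath": "/"}
    else:
        body["tcpHealthCheck"] = {"port": port}
    return body


if __name__ == "__main__":
    # "hc-with-logging" is a made-up name for illustration.
    print(build_health_check("hc-with-logging", port=80, protocol="HTTP"))
```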

    Then go to Logs Explorer and select your instance group as the log resource. The query editor will show something like this:

    resource.type="gce_instance_group"
    resource.labels.instance_group_id="3863333883516335882"
    resource.labels.instance_group_name="hc-group-1"

    Then add this line at the bottom:

    jsonPayload.healthCheckProbeResult.healthState="UNHEALTHY"

    Click "Run Query". The results should include log entries for unhealthy probe results, which can be used to trigger an alert.
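If you have several instance groups, the same query can be assembled programmatically. The helper below is a small sketch (the function name is hypothetical) that reproduces the four filter lines from this answer for a given group ID and name. Each clause sits on its own line, which Logs Explorer treats as an implicit AND.

```python
def unhealthy_probe_filter(group_id, group_name):
    """Return the Logs Explorer query for UNHEALTHY probe results of one instance group."""
    clauses = [
        'resource.type="gce_instance_group"',
        f'resource.labels.instance_group_id="{group_id}"',
        f'resource.labels.instance_group_name="{group_name}"',
        'jsonPayload.healthCheckProbeResult.healthState="UNHEALTHY"',
    ]
    # One clause per line; consecutive clauses are ANDed together.
    return "\n".join(clauses)


if __name__ == "__main__":
    # Values taken from the example query in this answer.
    print(unhealthy_probe_filter("3863333883516335882", "hc-group-1"))
```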

    Once the logs are displayed, click Actions and select "Create log alert".

    A window will open where you can name the alert and select a channel for notifications. I tested this with a group of 2 VMs: switching off either one of them triggered an alert, delivered by email.
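The log alert created through "Create log alert" corresponds to an alert policy with a log-match condition. As a rough sketch, the equivalent policy body could look like the one built below. The function name, the display names, and the 5-minute notification rate limit are assumptions for illustration, not what the console produced in the test above.

```python
def build_log_alert_policy(name, log_filter, channels, rate_limit_seconds=300):
    """Build an AlertPolicy body that fires when a log entry matches log_filter.

    `channels` is a list of notification channel resource names.
    The rate limit caps how often notifications are sent for matching entries.
    """
    return {
        "displayName": name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Unhealthy probe result logged",
                "conditionMatchedLog": {"filter": log_filter},
            }
        ],
        "alertStrategy": {
            "notificationRateLimit": {"period": f"{rate_limit_seconds}s"}
        },
        "notificationChannels": channels,
    }


if __name__ == "__main__":
    # The filter matches the query from this answer, written on one line.
    example_filter = (
        'resource.type="gce_instance_group" '
        'resource.labels.instance_group_name="hc-group-1" '
        'jsonPayload.healthCheckProbeResult.healthState="UNHEALTHY"'
    )
    print(build_log_alert_policy("LB backend unhealthy", example_filter, []))
```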

    Lastly, the health check can probe different protocols and ports depending on the service you're running (in my case, it checked for an HTTP reply on port 80).