amazon-web-services jenkins grafana alert cicd

Limit Restart Attempts in Self-Healing Process for Application on EC2


The idea is a standard "self-healing" (restart) procedure:

The application is down -> Grafana alert triggers a webhook -> Jenkins job restarts the application

But here is the problem:

To prevent endless restarts, I want to limit this to 3 restart attempts. After the third attempt, the job should notify an engineer by email and stop restarting the application. Once the engineer has fixed the application, the 3 attempts should be renewed so that the "self-healing" mechanism is ready to restart the application up to 3 times again.

But I don't know how to implement this 3-restart limit.


Solution

  • If your question is strictly about persistent state, then Jenkins can do it, kind of. You can save a value in env:

    env.matrjoshka = 42
    

    And then pick it up in the next run:

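    // previousBuild is null on the first run, and buildVariables holds strings
    def prev_env = currentBuild.previousBuild?.buildVariables
    def prev_count = (prev_env?.matrjoshka ?: '0') as Integer
    env.matrjoshka = "${prev_count + 1}"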
    

    Whether the fun of coding your own monitoring is worth it is entirely up to you, of course, but there are existing tools for that.
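
Building on the counter idea above, the 3-restart limit from the question could be sketched as a small decision function. This is only a sketch under assumptions: the function name `decide`, the action names, and the `alertStatus` argument are hypothetical, and it assumes the Grafana webhook passes the alert state ("firing" or "resolved") to the job in some way, for example as a build parameter.

```groovy
// Hypothetical helper, not part of Jenkins: decides what the job should do,
// given the attempt counter saved by the previous build.
// alertStatus is assumed to be "firing" or "resolved", passed in by the webhook.
def decide(int previousAttempts, String alertStatus, int maxRestarts) {
    if (alertStatus == 'resolved') {
        // The application is healthy again: renew the attempts.
        return [action: 'reset', attempts: 0]
    }
    if (previousAttempts < maxRestarts) {
        // Still within the limit: restart and count this attempt.
        return [action: 'restart', attempts: previousAttempts + 1]
    }
    if (previousAttempts == maxRestarts) {
        // Limit reached: notify once, and mark that the notification was sent.
        return [action: 'notify', attempts: maxRestarts + 1]
    }
    // The engineer has already been notified: do nothing until a reset.
    return [action: 'skip', attempts: previousAttempts]
}
```

In a pipeline, the previous count could be read from `currentBuild.previousBuild?.buildVariables` as shown earlier, the returned `attempts` written back to `env` as a string, and the `restart` and `notify` actions mapped to steps such as `sh` and `mail`. Note that the counter is lost if the previous build failed before saving it, so disabling concurrent builds and saving the value early is probably wise.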