
Spec for HTTP health checks?


I want to implement a simple health check and make it available via HTTP.

So far I only have experience writing Nagios plugins. Nagios has this API spec.

Is there already a common way to write vendor-neutral health checks?

If not, what should a sane health check return to be portable across many different monitoring server implementations?


Solution

  • Although there is no standard format for health checks, you should consider the major monitoring tools and what they expect from your endpoint. In most cases they react to specific HTTP status codes. For example, Amazon Route 53:

    waits for an HTTP status code of 200 or greater and less than 400

    Another tool, Consul, has a more specific definition:

    The status of the service depends on the HTTP response code: any 2xx code is considered passing, a 429 Too Many Requests is a warning, and anything else is a failure.

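    So check the handful of tools you are likely to integrate with later and choose an approach that satisfies all of them. Given the two rules quoted here, returning 200 when healthy and a 5xx code such as 503 when unhealthy is interpreted the same way by both.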
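
A minimal sketch of such an endpoint in Python, using only the standard library. The two classifier functions simply restate the Route 53 and Consul rules quoted above. Everything else is an assumption made for illustration, not part of any spec: the `/health` path, port 8080, the JSON body layout, and the `run_checks` placeholder.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route53_is_healthy(code):
    """Route 53 rule as quoted: 200 or greater and less than 400."""
    return 200 <= code < 400


def consul_status(code):
    """Consul rule as quoted: 2xx passing, 429 warning, anything else failure."""
    if 200 <= code < 300:
        return "passing"
    if code == 429:
        return "warning"
    return "failure"


def run_checks():
    """Hypothetical placeholder: replace with real checks (name -> bool)."""
    return {"database": True, "disk_space": True}


def health_response(checks):
    """Return (status_code, json_body): 200 if all checks pass, else 503."""
    healthy = all(checks.values())
    status = 200 if healthy else 503
    body = json.dumps({"status": "ok" if healthy else "fail", "checks": checks})
    return status, body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumed path; anything else is answered with 404.
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_response(run_checks())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8080):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), HealthHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Monitoring tools that only look at the status code can ignore the JSON body, which is there for humans and for tools that want per-check detail. Avoid 3xx responses for the health path: by the rules above, Route 53 would accept a 301 while Consul would treat it as a failure.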