Nagios Monitoring Hosts with check_ping


I've deployed a new instance of Nagios on a fresh install of CentOS 7 via the EPEL repository, so the Nagios Core version is 3.5.1.

After installing nagios and nagios-plugins-all (via yum), I've created a number of hosts and service definitions, have tested my configuration with nagios -v /etc/nagios/nagios.cfg, and have Nagios up and running!

Unfortunately, my host checks are failing (although my service checks are working perfectly fine).

Within the Nagios Web GUI / Dashboard, if I drill down into a Host page with the "Host State Information", I see this being reported for "Status Information" (IP address removed):

Status Information: /usr/bin/ping -n -U -w 30 -c 5 {my-host-ip-address}

CRITICAL - Could not interpret output from ping command

While troubleshooting, I went to the Nagios plugins directory (/usr/lib64/nagios/plugins) and ran the check_ping plugin by hand with the same arguments that check-host-alive uses (see below for my check-host-alive command definition):

./check_ping -H {my-ip-address} -w 3000.0,80% -c 5000.0,100% -p 5

This check_ping command returns the following output:

PING OK - Packet loss = 0%, RTA = 0.63 ms|rta=0.627000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

I haven't changed the definition of how check_ping works, and can confirm that I'm getting a "PING OK" whenever the command is run the same way that check-host-alive runs the command, so I cannot figure out what's going on!
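
A test from an interactive shell (often root) does not necessarily match what the daemon sees, because Nagios runs its checks as an unprivileged user. A minimal sketch for repeating the same test as that user follows. It assumes the daemon user is named nagios and that sudo is installed; 192.0.2.10 is a placeholder address and run_as_nagios is a hypothetical helper name. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: repeat the check-host-alive test as the daemon's user.
# Assumptions (not from the original post): the daemon user is "nagios"
# and sudo is installed. 192.0.2.10 is a placeholder address.
NAGIOS_USER=${NAGIOS_USER:-nagios}
PLUGIN_DIR=${PLUGIN_DIR:-/usr/lib64/nagios/plugins}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = run it

run_as_nagios() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "sudo -u $NAGIOS_USER $*"
    else
        sudo -u "$NAGIOS_USER" "$@"
    fi
}

# Same arguments as the check-host-alive command definition.
run_as_nagios "$PLUGIN_DIR/check_ping" -H 192.0.2.10 -w 3000.0,80% -c 5000.0,100% -p 5
# The ping invocation that Nagios reported under "Status Information".
run_as_nagios /usr/bin/ping -n -U -w 30 -c 5 192.0.2.10
```

If the plugin succeeds as root but fails under this user, the difference is more likely the permissions on ping than the command definition.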

Below are the command definitions for check-host-alive as well as check_ping.

# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }

{snip}

# 'check_ping' command definition
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

Any suggestions on how I can fix my check-host-alive command definition to work properly and evaluate the output of check_ping properly?

Edit

Below is the full define host {} template I'm using:

define host     {
        host_name                       myers    ; The name of this host
        alias                           Myers
        address                         [redacted]
        check_command                   check-host-alive
        contact_groups                  admins
        notifications_enabled           0               ; Host notifications are disabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        1
        max_check_attempts              2
        }

Solution

  • I was fairly certain that running chmod u+s /usr/bin/ping would solve the issue, but I was (and still am) wary about chmod'ing system files. It seems to me that there has to be a safer way to do it.

    However, in the end, that's what I did, and it works. I still don't like it from a security standpoint.
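
On the question of a safer way: one narrower option that is often suggested is to grant ping only the raw-socket capability through file capabilities, instead of making it fully setuid root. The sketch below only reports which mechanism a binary currently uses. The two commands in the trailing comment would make the change and are untested against this particular setup. It assumes getcap and setcap from libcap are installed, and ping_privilege is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report how a ping binary obtains its raw-socket privilege.
# Assumption: getcap (libcap) may or may not be installed; if it is
# missing, only the setuid bit is checked.
ping_privilege() {
    if [ -u "$1" ]; then
        echo "setuid"
    elif command -v getcap >/dev/null 2>&1 &&
         getcap "$1" 2>/dev/null | grep -q cap_net_raw; then
        echo "capability"
    else
        echo "none"
    fi
}

ping_privilege "${PING:-/usr/bin/ping}"

# To swap the setuid bit for the narrower capability, as root:
#   chmod u-s /usr/bin/ping
#   setcap cap_net_raw+ep /usr/bin/ping
```

A result of "none" for /usr/bin/ping would be consistent with the unprivileged nagios user being unable to run it, which is the failure that check_ping then reports as uninterpretable output.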