I've deployed a new instance of Nagios on a fresh install of CentOS 7 via the EPEL repository, so the Nagios Core version is 3.5.1.
After installing nagios and nagios-plugins-all (via yum), I created a number of host and service definitions, tested my configuration with nagios -v /etc/nagios/nagios.cfg, and have Nagios up and running!
Unfortunately, my host checks are failing (although my service checks are working perfectly fine).
Within the Nagios Web GUI / Dashboard, if I drill down into a Host page with the "Host State Information", I see this being reported for "Status Information" (IP address removed):
Status Information: /usr/bin/ping -n -U -w 30 -c 5 {my-host-ip-address}
CRITICAL - Could not interpret output from ping command
So in my troubleshooting, I drilled down into the Nagios Plugins directory (/usr/lib64/nagios/plugins), and ran a test with the check_ping plugin consistent with the way check-host-alive runs the command (see below for my check-host-alive command definition):
./check_ping -H {my-ip-address} -w 3000.0,80% -c 5000.0,100% -p 5
This check_ping command returns the following output:
PING OK - Packet loss = 0%, RTA = 0.63 ms|rta=0.627000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
I haven't changed the definition of how check_ping works, and can confirm that I'm getting a "PING OK" whenever the command is run the same way that check-host-alive runs the command, so I cannot figure out what's going on!
Below are the command definitions for check-host-alive as well as check_ping.
# 'check-host-alive' command definition
define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
{snip}
# 'check_ping' command definition
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
Any suggestions on how I can fix my check-host-alive command definition to work properly and evaluate the output of check_ping properly?
Edit
Below is the full define host {} template I'm using:
define host {
host_name myers ; The name of this host template
alias Myers
address [redacted]
check_command check-host-alive
contact_groups admins
notifications_enabled 0 ; Host notifications are disabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
register 1
max_check_attempts 2
}
I was fairly certain that running chmod u+s /usr/bin/ping would solve the issue, but I was (and still am) wary about chmod'ing system files. It seems to me that there has to be a safer way to do it.
However, in the end, that's what I did, and it works. I still don't like it from a security standpoint.
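For anyone who wants to see what state their own ping binary is in before (or after) changing it, here is a minimal sketch. It only inspects the file and changes nothing. The path in PING_BIN is an assumption taken from the "Status Information" line above, and the helper names has_setuid and describe_setuid are made up for illustration. The getcap step runs only if that tool happens to be installed.

```shell
#!/bin/sh
# Report whether a binary carries the setuid bit, and show any file
# capabilities if the getcap tool is available. Read-only: nothing is changed.

# Assumption: the path Nagios printed in "Status Information".
PING_BIN="${PING_BIN:-/usr/bin/ping}"

# Hypothetical helper: succeeds if the file has the setuid bit set
# (the bit that chmod u+s turns on).
has_setuid() {
    [ -u "$1" ]
}

# Hypothetical helper: print "missing", "setuid", or "not setuid".
describe_setuid() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif has_setuid "$1"; then
        echo "setuid"
    else
        echo "not setuid"
    fi
}

echo "$PING_BIN: $(describe_setuid "$PING_BIN")"

# File capabilities are a narrower way to grant raw-socket access than
# setuid root. Only query them if the file exists and getcap is installed.
if [ -e "$PING_BIN" ] && command -v getcap >/dev/null 2>&1; then
    getcap "$PING_BIN" 2>/dev/null || true
fi
```

If getcap prints nothing and the file is not setuid, ping has no elevated privileges at all, which would be consistent with the failure seen from the nagios user. The narrower grant I would expect to work in place of chmod u+s is setcap cap_net_raw+p /usr/bin/ping, but I have not tested that on this box, so treat it as an assumption.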