I have a fresh install of Hortonworks Data Platform 2.2 installed on a small cluster (4 machines) but when I login to the Ambari GUI, the majority of dashboard stats boxes (HDFS disk usage, Network usage, Memory usage etc) are not populated with any statistics, instead they show the message:
No data There was no data available. Possible reasons include inaccessible Ganglia service
Clicking on the HDFS service link gives the following summary:
NameNode Started
SNameNode Started
DataNodes 4/4 DataNodes Live
NameNode Uptime Not Running
NameNode Heap n/a / n/a (0.0% used)
DataNodes Status 4 live / 0 dead / 0 decommissioning
Disk Usage (DFS Used) n/a / n/a (0%)
Disk Usage (Non DFS Used) n/a / n/a (0%)
Disk Usage (Remaining) n/a / n/a (0%)
Blocks (total) n/a
Block Errors n/a corrupt / n/a missing / n/a under replicated
Total Files + Directories n/a
Upgrade Status Upgrade not finalized
Safe Mode Status n/a
The Alerts and Health Checks box to the right of the screen is not displaying any information but if I click on the settings icon this opens the Nagios frontend and again, everything looks healthy here!
The install went smoothly (CentOS 6.5) and everything looks good as far as all services are concerned (all started with green tick next to service name). There are some stats displayed on the dashboard: 4/4 datanodes are live, 1/1 Nodemanages live & 1/1 Supervisors are live. I can write files to HDFS so its looks like it's a Ganglia issue?
The Ganglia daemon seems to be working ok:
ps -ef | grep gmond
nobody 1720 1 0 12:54 ? 00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPHistoryServer/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPHistoryServer/gmond.pid
nobody 1753 1 0 12:54 ? 00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPFlumeServer/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPFlumeServer/gmond.pid
nobody 1790 1 0 12:54 ? 00:00:48 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPHBaseMaster/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPHBaseMaster/gmond.pid
nobody 1821 1 1 12:54 ? 00:00:57 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPKafka/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPKafka/gmond.pid
nobody 1850 1 0 12:54 ? 00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPSupervisor/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPSupervisor/gmond.pid
nobody 1879 1 0 12:54 ? 00:00:45 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPSlaves/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPSlaves/gmond.pid
nobody 1909 1 0 12:54 ? 00:00:48 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPResourceManager/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPResourceManager/gmond.pid
nobody 1938 1 0 12:54 ? 00:00:50 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNameNode/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNameNode/gmond.pid
nobody 1967 1 0 12:54 ? 00:00:47 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNodeManager/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNodeManager/gmond.pid
nobody 1996 1 0 12:54 ? 00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNimbus/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNimbus/gmond.pid
nobody 2028 1 1 12:54 ? 00:00:58 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPDataNode/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPDataNode/gmond.pid
nobody 2057 1 0 12:54 ? 00:00:51 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPHBaseRegionServer/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPHBaseRegionServer/gmond.pid
I have checked the Ganglia service on each node, the processes are running as expected
ps -ef | grep gmetad
nobody 2807 1 2 12:55 ? 00:01:59 /usr/sbin/gmetad --conf=/etc/ganglia/hdp/gmetad.conf --pid-file=/var/run/ganglia/hdp/gmetad.pid
I have tried restarting Ganglia services with no luck, restarted all services but still the same. Does anyone have any ideas how I get the dashboard to work properly? Thank you.
It turns out to be a proxy issue, to access the internet I had to add my proxy details to the file /var/lib/ambari-server/ambari-env.sh
export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -Dhttp.proxyHost=theproxy -Dhttp.proxyPort=80 -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'
When ganglia was trying to access each node in the cluster the request was going via the proxy and never resolving, to overcome the issue I added my nodes to the exclude list (add the flag -Dhttp.nonProxyHosts) like so:
export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -Dhttp.proxyHost=theproxy -Dhttp.proxyPort=80 -Dhttp.nonProxyHosts="localhost|node1.dms|node2.dms|node3.dms|etc" -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'
After adding the exclude list the stats were retrieved as expected!