I have a Hadoop cluster of three containers on three different hosts; the details are as follows. First, I installed "weave net" on my three hosts (150.20.11.133, 150.20.11.136, 150.20.11.157) with these commands:
sudo curl -L git.io/weave -o /usr/local/bin/weave
sudo chmod a+x /usr/local/bin/weave
eval $(weave env)
Then I connected the three hosts together via Weave. In fact, I ran this command on each host; for example, on 150.20.11.133:
weave launch 150.20.11.136 150.20.11.157
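As a quick sanity check (not one of my original steps, just something that can be run on any of the hosts), the Weave CLI can report whether the peers actually connected:
weave status peers
weave status connections
If the connections are listed as established, the three routers are talking to each other.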
After connecting the three hosts together, I had to set up passwordless SSH between the master and the workers. Therefore, I did these steps. On each host:
ssh-keygen -t rsa
In master:
ssh-copy-id spark@172.28.10.136
ssh-copy-id spark@172.28.10.157
cat /home/user/.ssh/id_rsa.pub >> /home/user/.ssh/authorized_keys
As a result, I could SSH from the master host to the slaves without a password.
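To double-check the passwordless login from the master host (just a verification step, not part of the setup itself):
ssh spark@172.28.10.136 hostname
ssh spark@172.28.10.157 hostname
Each command should print the worker's hostname without asking for a password.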
On each host, I built my Docker image, whose Dockerfile contains the Hadoop configuration, and then I ran it:
In Master:
docker run -v /home/user/.ssh:/root/.ssh --privileged -p 52222:22 \
    -e WEAVE_CIDR=10.32.0.1/12 -ti my-hadoop
In slave1:
docker run -v /home/user/.ssh:/root/.ssh --privileged -p 52222:22 \
    -e WEAVE_CIDR=10.32.0.2/12 -ti my-hadoop
In slave2:
docker run -v /home/user/.ssh:/root/.ssh --privileged -p 52222:22 \
    -e WEAVE_CIDR=10.32.0.3/12 -ti my-hadoop
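To confirm each container really got the WEAVE_CIDR address it was given (again, just a sanity check on my side), the host's Weave CLI can list the attached containers and their weave IPs:
weave ps
On the master host this should show the my-hadoop container with 10.32.0.1/12, and 10.32.0.2/12 and 10.32.0.3/12 on the slaves.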
In each container, I ran these commands:
chmod 700 ~/.ssh/
chmod 600 ~/.ssh/*
chown -R root ~/.ssh/
chgrp -R root ~/.ssh/
chmod -R 750 /root/.ssh/authorized_keys
In the master container, I ran this command (it removes the old localhost entry from known_hosts) so that SSH to localhost would work:
ssh-keygen -f "/root/.ssh/known_hosts" -R localhost
I also restarted the SSH service in each container:
service ssh restart
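At this point a quick check from inside the master container (my own verification, assuming the weave IPs above) is:
ssh root@10.32.0.2 hostname
ssh root@10.32.0.3 hostname
Both should log in immediately without a password prompt.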
So I could SSH from the master container to the slaves without a password. For the Hadoop configuration, I did the following. First, on the master node:
hadoop namenode -format
The workers file had these contents in all three containers:
root@10.32.0.2
root@10.32.0.3
core-site.xml had these contents in all three containers:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://root@10.32.0.1:9000</value>
    </property>
</configuration>
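To make sure the containers really pick this value up, Hadoop can print it back (just a sanity check, assuming the config files are in the usual etc/hadoop directory under /opt/hadoop):
hdfs getconf -confKey fs.defaultFS   # assumes HADOOP_CONF_DIR points at the container's Hadoop config
This should print hdfs://root@10.32.0.1:9000 in all three containers.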
hdfs-site.xml also had these contents in all three containers:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
    </property>
</configuration>
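One thing worth mentioning: the namenode and datanode directories referenced above must exist and be writable inside each container before the daemons start. If the Dockerfile does not already create them, something like this (using the same paths as in the config) does it:
mkdir -p /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode   # same paths as dfs.namenode.name.dir / dfs.datanode.data.dir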
Then I ran this in the master container:
/opt/hadoop/sbin/start-dfs.sh
When I ran jps in each container, I got these results. In the Master container:
483 SecondaryNameNode
231 NameNode
747 Jps
In each Worker:
117 DataNode
186 Jps
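From inside the master container, the cluster state can also be checked on the command line (just a verification, not part of the original steps):
hdfs dfsadmin -report
It should list two live datanodes, 10.32.0.2 and 10.32.0.3.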
The problem is that I want to see the Hadoop UI in a browser. I open this URL, but it does not show anything:
http://10.32.0.1:8088
By the way, I have already exposed these ports in the Dockerfile:
EXPOSE 22 9000 8088 50070 50075 50030 50060
Would you please tell me how I can see the Hadoop cluster UI in a browser?
Any help would be appreciated.
I could see the datanodes in the browser by adding these lines to hdfs-site.xml:
<property>
    <name>dfs.http.address</name>
    <value>10.32.0.1:50070</value>
</property>
<property>
    <name>dfs.secondary.http.address</name>
    <value>10.32.0.1:50090</value>
</property>
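After restarting HDFS so these addresses take effect, a quick check from any machine on the weave network (a sketch, assuming the namenode came up cleanly) is:
curl http://10.32.0.1:50070/
If it returns an HTML page instead of a connection error, the namenode web UI is reachable on port 50070.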
Hope it was helpful.