hadoop high-availability

When a connected namenode stops in Hadoop, doesn't the datanode automatically connect to another namenode?


We installed hadoop 2.2.0 on Ubuntu and configured HA as follows.

namenode: master, master-ha

datanode: slave

We confirmed that master and master-ha are connected properly and that their states are active and standby, respectively. We also confirmed that the slave datanode is connected to the master server.

To check whether failover worked, I killed the namenode on the master server and checked whether the state of master-ha changed to active. It did. However, the datanode on the slave server kept logging attempts to connect to the master server's namenode.

I know the failover process for the namenode, but I'm not sure what happens to the datanode.

I thought the datanode should automatically connect to a live namenode, but I searched and it says that's not the case. Is this correct?

Also, once the standby namenode becomes active, normal input/output is possible. Given that, I would expect the datanode not to report connection errors. What do you think?


Solution

  • It's possible for a datanode to be in the middle of a replica-sync or other read/write operation, and have pending operations while the failover occurs, so yes, there will be temporary logs about connection errors...

    The datanodes' configuration needs to reference a NameService and the addresses of all NameNodes (fs.defaultFS in core-site.xml, plus the HA properties in hdfs-site.xml). Otherwise the datanodes will not switch over automatically: the original NameNode is gone and cannot notify them of any change.

    Consider upgrading to the latest stable Hadoop 3.x release, where the handling of network reconnections may have been improved.
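
    As a rough sketch of the configuration described above, the fragments below show the standard HDFS HA properties that name a nameservice and list both NameNode addresses. The hostnames master and master-ha come from the question; the nameservice ID mycluster, the NameNode IDs nn1 and nn2, and port 8020 are assumptions for illustration and should be replaced with the values used in your cluster.

    ```xml
    <!-- hdfs-site.xml: should be the same on the NameNodes and on the datanode -->
    <configuration>
      <!-- Logical name for the HA pair ("mycluster" is a hypothetical ID) -->
      <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
      </property>
      <!-- IDs of the NameNodes in this nameservice ("nn1", "nn2" are hypothetical) -->
      <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
      </property>
      <!-- RPC address of each NameNode (port 8020 is assumed) -->
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>master:8020</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>master-ha:8020</value>
      </property>
      <!-- Lets HDFS clients find whichever NameNode is currently active -->
      <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
    </configuration>

    <!-- core-site.xml: point the default filesystem at the nameservice, not at a single host -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
      </property>
    </configuration>
    ```

    With a setup along these lines, the datanode registers with and sends heartbeats to both NameNodes, so retry messages for the stopped NameNode in the datanode log are expected until that NameNode is restarted. They do not by themselves mean that the datanode has lost contact with the newly active one.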