hadoop hdfs ambari hdp

How to identify the cause of under-replicated blocks


We installed a small HDP cluster with one DataNode machine.

The HDP version is 2.6.5 and the Ambari version is 2.6.1.

This is a new cluster that contains two NameNodes and only one DataNode (worker machine).

The interesting behavior we see is that the number of under-replicated blocks on the Ambari dashboard keeps increasing; for now it stands at 15000 under-replicated blocks.

As far as we know, the most common root cause of this problem is network issues between the NameNode and the DataNode,

but this isn't the case in our Hadoop cluster.

We can also decrease the number of under-replicated blocks with the following procedure:

su - <$hdfs_user>

bash-4.1$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files

bash-4.1$ for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 "$hdfsfile"; done

But we do not want to do this, because the under-replication problem should not happen in the first place.

Maybe some HDFS parameters need tuning, but we are not sure about this.

Please let us know of any advice that can help us.

[four images omitted]


Solution

  • If the under-replicated blocks problem has been happening since the cluster was installed, check the following property:

    dfs.replication

    This determines how many replicas of each block are created. If you have only one DataNode, it should be set to 1.

    From the metrics page, everything looks fine to me.
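
A rough sketch of how that check could be done from the command line, assuming the `hdfs` client is on the PATH and the commands run as the HDFS superuser. The function names are made up for illustration and are not part of Hadoop. Keep in mind that `dfs.replication` only applies to files written after the change; files that already exist keep the factor they were created with until `setrep` is run on them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (illustrative names); they only wrap standard hdfs CLI calls.

# Print the default replication factor the client configuration resolves to.
default_replication() {
    hdfs getconf -confKey dfs.replication
}

# Count the fsck output lines that report an under-replicated block
# under the given path (defaults to /).
count_under_replicated() {
    hdfs fsck "${1:-/}" 2>/dev/null | grep -c 'Under replicated'
}

# Set the replication factor of files that already exist under a path.
# When the path is a directory, setrep applies to every file below it.
set_replication() {
    hdfs dfs -setrep "$1" "${2:-/}"
}

# Example session on a single-DataNode cluster (uncomment to run):
# default_replication          # if this prints 3, set dfs.replication to 1 in Ambari
# count_under_replicated /
# set_replication 1 /
```

After `dfs.replication` is set to 1 in Ambari and HDFS is restarted, running `set_replication 1 /` once should bring the existing files in line, and the under-replicated count on the dashboard would be expected to drop.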