We have a 12-server Hadoop cluster (CDH). Recently we started decommissioning three of the nodes, but the process has been running for more than 2 days and never finishes. In particular, over the past 24 hours I have seen only about 94G of data remaining on those three DataNodes, and that size has not changed, even though the number of under-replicated blocks is already zero. The replication factor is 3 for all data in HDFS.
Below is the output of the hadoop fsck command:
Total size: 5789534135468 B (Total open files size: 94222879072 B)
Total dirs: 42458
Total files: 5494378
Total symlinks: 0 (Files currently being written: 133)
Total blocks (validated): 5506578 (avg. block size 1051385 B) (Total open file blocks (not validated): 822)
Minimally replicated blocks: 5506578 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.999584
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 13
Number of racks: 1
FSCK ended at Mon Oct 17 16:36:09 KST 2016 in 781094 milliseconds
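For reference, the files that are still open for write (and where their blocks live) can be listed with the standard fsck options below; the / path simply means the whole filesystem. The open-files total above, 94222879072 B, is roughly the same 94G that is not draining from the three nodes, so blocks held open by writers may be what is pinned there.

hdfs fsck / -openforwrite -files -blocks -locations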
You can try stopping the Cloudera agent on the DataNode:
sudo service cloudera-scm-agent hard_stop_confirmed
After the agent is stopped, you can delete that DataNode from the HDFS Instances page in Cloudera Manager.
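Before removing the role, it may also be worth confirming from the NameNode side that the node really shows as Decommissioned rather than Decommission In Progress; a quick check along these lines (datanode03.example.com is a placeholder hostname):

hdfs dfsadmin -report | grep -A 5 "Hostname: datanode03.example.com"

and look for a "Decommission Status : Decommissioned" line in the output.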
Hope this works.