We have an HDP 2.6.4 cluster managed by Ambari.
The Ambari dashboard shows "Blocks with corrupt replicas" with a value of 1,
and we also see it in the output of
$ hdfs dfsadmin -report
Configured Capacity: 57734285504512 (52.51 TB)
Present Capacity: 55002945909856 (50.02 TB)
DFS Remaining: 29594344477833 (26.92 TB)
DFS Used: 25408601432023 (23.11 TB)
DFS Used%: 46.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 1 <-----------------
Missing blocks: 0
Missing blocks (with replication factor 1): 0
In order to find the corrupted file, we ran the following:
$ hdfs fsck -list-corruptfileblocks
Connecting to namenode via http://master.sys76.com:50070/fsck?ugi=hdfs&listcorruptfileblocks=1&path=%2F
The filesystem under path '/' has 0 CORRUPT files
but as we can see above, no corrupt file was found.
We also ran the following in order to delete the corrupted file:
hdfs fsck / -delete
but "Blocks with corrupt replicas" still remains at 1.
Any suggestions?
A file flagged as having "blocks with corrupt replicas" is not a corrupt file; it does not mean the file has a "corrupt block" or has lost any data.
A "block with corruputed replicas" is a block where at least one replica is corrupt BUT can still be recovered by the remaining (majority of the) replicas which means the content can be recovered from those replicas.
Also, the fsck command will not tell you anything about files with blocks in that state, because it only checks whether the files and blocks in the filesystem are OK, and since blocks with corrupt replicas can be auto-fixed by HDFS they are not reported.
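If you want to see which block actually has the corrupt replica, and on which DataNode it lives, one thing worth trying (a sketch on my part, not guaranteed to show it on every Hadoop version; the file name is arbitrary and /var/log/hadoop/hdfs is just the usual HDP log directory, adjust it for your cluster) is hdfs dfsadmin -metasave, which makes the NameNode dump its block state, including replicas it has marked as corrupt, into a file in its log directory:

# ask the NameNode to dump its block metadata (run as the hdfs superuser)
$ sudo -u hdfs hdfs dfsadmin -metasave metasave-corrupt.txt

# then, on the NameNode host, look for replicas flagged as corrupt
$ grep -i corrupt /var/log/hadoop/hdfs/metasave-corrupt.txt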
The only command that reports this counter is hdfs dfsadmin -report, and that is the value Ambari uses to raise the warning.
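Ambari ultimately gets this number from the NameNode's FSNamesystem metrics, so if you want to double-check the raw counter yourself, a quick way (a sketch; the host and port are taken from the fsck output in the question, adjust them for your cluster) is to query the NameNode JMX endpoint:

$ curl -s 'http://master.sys76.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep CorruptBlocks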
As far as I know, the only way to clear the warning is to wait for HDFS to auto-fix it by replacing the corrupt replica with a good one copied from another DataNode that holds a healthy replica.
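If you want to watch for the counter going back to zero, a minimal sketch (it just polls the same dfsadmin report once a minute) would be:

while true; do
  c=$(hdfs dfsadmin -report 2>/dev/null | awk -F': ' '/Blocks with corrupt replicas/ {print $2}')
  echo "$(date): Blocks with corrupt replicas: $c"
  [ "$c" = "0" ] && break
  sleep 60
done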