hadoophbaseclouderanamenode

Unable to delete HDFS Corrupt files


I am unable to delete corrupt files present in my HDFS. Namenode has run into Safe mode. Total number of blocks are 980, out of which 978 have reported. When I run the following command,

sudo -u hdfs hdfs dfsadmin -report

The report generated is,

Safe mode is ON
Configured Capacity: 58531520512 (54.51 GB)
Present Capacity: 35774078976 (33.32 GB)
DFS Remaining: 32374509568 (30.15 GB)
DFS Used: 3399569408 (3.17 GB)
DFS Used%: 9.50%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):

Name: 10.0.2.15:50010 (quickstart.cloudera)
Hostname: quickstart.cloudera
Decommission Status : Normal
Configured Capacity: 58531520512 (54.51 GB)
DFS Used: 3399569408 (3.17 GB)
Non DFS Used: 19777388544 (18.42 GB)
DFS Remaining: 32374509568 (30.15 GB)
DFS Used%: 5.81%
DFS Remaining%: 55.31%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Nov 14 10:39:58 IST 2017

And for the following command when executed,

sudo -u hdfs hdfs fsck /

The output is,

Connecting to namenode via http://quickstart.cloudera:50070/fsck?ugi=hdfs&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /10.0.2.15 for path / at Tue Nov 14 10:41:25 IST 2017
/hbase/oldWALs/quickstart.cloudera%2C60020%2C1509698296866.default.1509701903728: CORRUPT blockpool BP-1914853243-127.0.0.1-1500467607052 block blk_1073743141

/hbase/oldWALs/quickstart.cloudera%2C60020%2C1509698296866.default.1509701903728: MISSING 1 blocks of total size 83 B..
/hbase/oldWALs/quickstart.cloudera%2C60020%2C1509698296866.meta.1509701932269.meta: CORRUPT blockpool BP-1914853243-127.0.0.1-1500467607052 block blk_1073743142

/hbase/oldWALs/quickstart.cloudera%2C60020%2C1509698296866.meta.1509701932269.meta: MISSING 1 blocks of total size 83 B
Status: CORRUPT
Total size: 3368384392 B (Total open files size: 166 B)
Total dirs: 286
Total files:    966
Total symlinks:     0 (Files currently being written: 3)
Total blocks (validated):   980 (avg. block size 3437126 B) (Total open file blocks (not validated): 2)
********************************
UNDER MIN REPL'D BLOCKS:    2 (0.20408164 %)
dfs.namenode.replication.min:   1
CORRUPT FILES:  2
MISSING BLOCKS: 2
MISSING SIZE:       166 B
CORRUPT BLOCKS:     2
********************************
Minimally replicated blocks:    978 (99.79592 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks:    0 (0.0 %)
Mis-replicated blocks:      0 (0.0 %)
Default replication factor: 1
Average block replication:  0.9979592
Corrupt blocks:     2
Missing replicas:       0 (0.0 %)
Number of data-nodes:       1
Number of racks:        1
FSCK ended at Tue Nov 14 10:41:26 IST 2017 in 774 milliseconds
The filesystem under path '/' is CORRUPT

Can anyone please help in either correcting the corrupted blocks, (or) deleting them? Thanks in advance.


Solution

  • As it's said that Namenode is in Safe mode, first turn it off.

    hdfs dfsadmin -safemode leave
    

    Then execute either of commands

    hdfs fsck / | egrep -v '^\.+$' | grep -v replica | grep -v Replica
    

    or

    hdfs fsck hdfs://quickstart.cloudera:50070/ | egrep -v '^\.+$' | grep -v replica | grep -v Replica
    

    The output would be somewhat similar to

    /path/to/filename.fileextension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305
    
    /path/to/filename.fileextension: MISSING 1 blocks of total size 15620361 B
    

    In your case corrupted files are already listed. So execute below commands

    hdfs dfs -rm /hbase/oldWALs/quickstart.cloudera%2C60020%2C1509698296866.default.1509701903728
    hdfs dfs -rm /hbase/oldWALs/quickstart.cloudera%2C60020%2C1509698296866.meta.1509701932269.meta
    

    And don't enter into Safemode. Just continue working. Yippee!!