In the Cloudera blog and the Hortonworks forum I read:
"Every file, directory and block in HDFS is represented as an object in the namenode’s memory, each of which occupies 150 bytes, as a rule of thumb. So 10 million files, each using a block, would use about 3 gigabytes of memory"
BUT:
10,000,000 * 150 = 1,500,000,000 bytes = 1.5 GB.
It looks like, to reach 3 GB, each file would need about 300 bytes. I don't understand why 300 bytes are used per file instead of 150. This is just the NameNode; replication factor should not come into it.
Thanks
For every small file, the NameNode needs to store two objects in memory: a per-file object (the inode) and a per-block object. At roughly 150 bytes each, that comes to approximately 300 bytes per file.
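
Here is a minimal back-of-envelope sketch in Python that applies this rule of thumb; `estimate_namenode_memory` and `BYTES_PER_OBJECT` are hypothetical names for illustration, not part of any HDFS API, and the 150-byte figure is the heuristic from the quote, not an exact constant:

```python
# Rule-of-thumb size of one NameNode object (file inode or block).
# This is the heuristic from the Cloudera quote; actual usage varies
# by Hadoop version, file-name length, and other metadata.
BYTES_PER_OBJECT = 150

def estimate_namenode_memory(num_files: int, blocks_per_file: int = 1) -> int:
    """Estimate NameNode heap usage in bytes.

    Each file contributes one inode object plus one object per block,
    which is why a single-block file costs ~300 bytes, not ~150.
    """
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT

if __name__ == "__main__":
    total = estimate_namenode_memory(10_000_000)
    print(f"{total:,} bytes = {total / 1e9:.1f} GB")
    # 3,000,000,000 bytes = 3.0 GB
```

Running it reproduces the 3 GB figure from the quote: 10 million files × (1 inode + 1 block) × 150 bytes.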