azurereplicationhadoop3

How scheme of Replication works in hadoop3.x?


As per my understanding in Hadoop-3.x they introduced the support for Microsoft Azure Data Lake filesystem and the scheme of Replication is changed. Are these only the major features, are any other new features they are introducing as part of hadoop-3.x.?

And how scheme of Replication works in Hadoop-3.x


Solution

  • Hadoop 3.0 will incorporate Erasure Coding in place of replication consuming comparatively less storage space whilst providing same level of fault tolerance.

    Hadoop 3x replication scheme leading to 200% additional storage space and resource overhead.

    For more details, refer “HDFS Erasure Coding”.

    To understand the how HDFS Erasure Coding works, Refer: “Introduction to HDFS Erasure Coding in Apache Hadoop”.