hadoopversions

Why are Hadoop releases not published in the same order as their version numbers?


I've visited the website to download the latest version and I found that 2.8.4 was released after 2.9.1. Why does that happen? And which one should I download?


Solution

  • Why are companies still running Java 6 and 7 even though they are end-of-life? Why is Java 8 still being updated when Java 9 and 10 are available?

    My point is that Hadoop 2.7.x was, at one time, the stable branch. 2.8 and 2.9 each introduced major changes that are potentially breaking or unstable. The older branches still need maintenance releases to fix bugs and backport useful features, so a 2.8.x patch release can come out after a 2.9.x one. You're welcome to read the release notes to see what each release contains.

    It's worth mentioning that Hadoop vendors such as Hortonworks and Cloudera currently ship a 2.7-based version with their own patches applied on top of what you'd get from the Apache site.

    Meanwhile, if you want the latest and greatest and don't care about stability, you can use Hadoop 3.x. If you also want other tools such as Spark, Sqoop, HBase, or Hive, then I'd suggest staying on 2.7 for now. Or at least read the documentation for each component and check which Hadoop versions its installation requirements list.
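
To make the branch idea concrete, here is a minimal Python sketch. It takes only the two versions named in the question, in the order they were released, and shows that sorting by number gives a different order, while grouping by major.minor branch gives one "latest" release per maintained line. The helper names (`parse`, `branch`, `latest_per_branch`) are made up for illustration and are not part of any Hadoop tooling.

```python
def parse(version):
    """Turn '2.8.4' into (2, 8, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def branch(version):
    """Return the (major, minor) release line, e.g. (2, 8) for '2.8.4'."""
    return parse(version)[:2]


def latest_per_branch(versions):
    """Map each release line to its highest patch release."""
    latest = {}
    for v in versions:
        b = branch(v)
        if b not in latest or parse(v) > parse(latest[b]):
            latest[b] = v
    return latest


# Release order as described in the question: 2.9.1 first, then 2.8.4.
announced = ["2.9.1", "2.8.4"]

# Sorting by version number does not reproduce the release order.
by_number = sorted(announced, key=parse)

print(by_number)
print(latest_per_branch(announced))
```

Read this way, "which one should I download" becomes "which branch do I want", and then you take the newest patch release on that branch.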