Versioning of Hadoop


The main versions or branches of Hadoop are:
1.      Version 0.20.0–0.20.2: - The 0.20 branch of Hadoop is widely regarded as the most stable version and is the one most commonly used in production. The first release was in April 2009. Cloudera CDH2 and CDH3 are both based on this branch.

2.      0.20-append: - Version 0.20 lacked support for file appends in HDFS, which Apache HBase needed. The append feature was developed on a separate branch called 0.20-append. No official release was ever made from this branch.

3.      0.20-security: - Yahoo, one of the major contributors to Apache Hadoop, invested in adding full Kerberos support to core Hadoop. It later contributed this work back to Hadoop in the form of the 0.20-security branch, a version of Hadoop 0.20 with Kerberos authentication support. This branch was later released as the 0.20.20X releases.

4.      0.20.203–0.20.205: - There was a strong desire within the community to produce an official release of Hadoop that included the 0.20-security work. The 0.20.20X releases contained not only security features from 0.20-security, but also bug fixes and improvements on the 0.20 line of development. Generally, it no longer makes sense to deploy these releases as they’re superseded by 1.0.0.


5.      0.21.0: - The 0.21 branch was cut from Hadoop trunk and released in August 2010. It was considered a developer preview or alpha-quality release, intended to highlight some of the features that were in development at the time. Despite the warning from the Hadoop developers, a small number of users deployed the 0.21 release anyway. This release does not include security, but does have the append feature.

6.      0.22.0: - In December 2011, the Hadoop community released version 0.22, which, like 0.21, was based on trunk. This release includes security, but only for HDFS. Somewhat oddly, 0.22 was released after 0.23 yet has less functionality, because the 0.22 branch was cut from trunk earlier than the 0.23 branch.

7.      0.23.0: - In November 2011, version 0.23 of Hadoop was released. Also cut from trunk, 0.23 includes security, append, YARN, and HDFS federation. This release has been dubbed a developer preview or alpha-quality release. This line of development is superseded by 2.0.0.

8.      1.0.0: - Version 1.0.0 of Hadoop was released from the 0.20.205 line of development. This means that 1.0.0 does not contain all of the features and fixes found in the 0.21, 0.22, and 0.23 releases. It does include the security features.

9.      1.2.1: - The stable release of the 1.2 line, version 1.2.1, was released on 1 August 2013.

10.  2.0.0-alpha: - In May 2012, version 2.0.0 was released from the 0.23.0 branch. Like 0.23.0, it is considered alpha-quality, and it is the first version in the hadoop-2.x series. It includes YARN and removes the traditional MRv1 jobtracker and tasktracker daemons. While MapReduce on YARN is API compatible with MRv1, the underlying implementation is different. This release includes:
o       YARN aka NextGen MapReduce
o       HDFS Federation
o       Performance improvements
o       Wire-compatibility for both HDFS and YARN/MapReduce.
11.  2.1.0-beta: - Hadoop 2.1.0-beta includes the following significant improvements over the previous stable 1.x releases:
·         HDFS Federation
·         MapReduce NextGen aka YARN aka MRv2
·         HDFS HA for NameNode (manual failover)
·         HDFS Snapshots
·         Support for running Hadoop on Microsoft Windows
·         YARN API stabilization
·         Binary Compatibility for MapReduce applications built on hadoop-1.x
·         Substantial amount of integration testing with rest of projects in the ecosystem
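
The feature notes scattered through the list above can be gathered into a small lookup table, which makes it easier to see which release lines carry a given feature. The following is only an illustrative sketch: the `FEATURES` dictionary and the `releases_with` helper are hypothetical names, and the table records only what the list explicitly states, so a missing entry means "not mentioned here", not "not present in that release".

```python
# Features each release line is stated to include in the list above.
# Assumption: only features the text explicitly mentions are recorded,
# so this is a summary of the post, not an authoritative feature matrix.
FEATURES = {
    "0.20-append": {"append"},
    "0.20.203-0.20.205": {"security"},
    "0.21.0": {"append"},
    # Kept as a distinct label because 0.22 security covers HDFS only.
    "0.22.0": {"security (HDFS only)"},
    "0.23.0": {"security", "append", "YARN", "HDFS federation"},
    "1.0.0": {"security"},
    "2.0.0-alpha": {"YARN", "HDFS federation"},
}


def releases_with(feature):
    """Return, in sorted order, the release lines recorded as having `feature`."""
    return sorted(name for name, feats in FEATURES.items() if feature in feats)


if __name__ == "__main__":
    for feature in ("append", "security", "YARN"):
        print(feature, "->", releases_with(feature))
```

For example, `releases_with("YARN")` returns only the 0.23.0 and 2.0.0-alpha entries, which matches the point that YARN arrived with the trunk-based releases and not with the 0.20/1.0 line.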
