Apache Hadoop and Its Distributions

To understand how Hadoop is deployed in industry, we need to be aware of its distributors. Hadoop distributors are companies that provide Apache Hadoop-based software, support, services, and training to business customers. Multiple vendors exist in the market, and the two most widely used Apache Hadoop distributions come from:
i.  Cloudera
ii. Hortonworks

Cloudera v. Hortonworks: Tale of the Tape 
Cloudera has plenty to boast about. It has in fact contributed significantly to the open source Apache Hadoop project and its Hadoop distribution is in production at high-profile Web companies like Groupon and Klout. It launched an innovative partner and certification program in September and Cloudera engineers continue to develop new features to help Hadoop meet enterprise-level uptime and security requirements.

In addition, Cloudera has a two-year head start over Hortonworks in servicing a small but growing customer base. No question the Hortonworks team learned many valuable lessons working at Yahoo, but supporting an internal Hadoop deployment at one large technology company is very different from supporting a large and varied customer base of both technology and non-technology companies. For Hortonworks to become a self-sufficient Hadoop support juggernaut, which is the stated goal of its CEO, Eric Baldeschwieler, the company needs to prove it can deliver.

Finally, consider the competing Hadoop distributions themselves. Their cores are both based on the open source Apache Hadoop distribution and related sub-projects, with the real differentiation being the installation and administration management add-on tools. Cloudera Management Suite, while proprietary, includes important enterprise-level features such as automated, wizard-based Hadoop deployment capabilities, dashboards for configuration management and a resource management module for capacity and expansion planning. Ambari, Hortonworks' answer to Cloudera Management Suite, is open but is less mature and currently lacks advanced cluster management capabilities.

The reality is that Cloudera’s Hadoop distribution is largely open source and the risk of vendor lock-in due to its relatively few proprietary components is, in Wikibon’s opinion, lower than what Hortonworks marketing implies. Organizations that come to rely on Cloudera Enterprise for crucial parts of the business but later decide to move to a different Hadoop distribution or competing Big Data approach should be able to do so with little difficulty.

That said, Hortonworks’ 100% open approach means that updates and improvements to its distribution are likely to come more quickly than those to Cloudera’s distribution, and that partners may find it easier to integrate with HDP than with Cloudera Enterprise. These are significant factors that potential customers should weigh.

