
Hadoop's ability to deliver business growth is worth the bother

Hadoop isn't the easiest big data technology to master, but any company looking for growth can't ignore it.


Hadoop is going to be big, but today its adoption is still small. According to Gartner, there are only 1,000 Hadoop systems in production, and most companies have not moved Hadoop beyond the proof-of-concept phase. Partly, this is a matter of difficulty: Hadoop isn't easy to master. But don't mistake Hadoop's current production footprint for its potential: companies that understand data's value and focus on business growth can't afford to overlook Hadoop.

What's holding back Hadoop?

Hadoop's biggest problem is that it's simply not that easy to learn. Unlike other prominent big data technologies, such as MongoDB, Hadoop requires significant effort to learn and master. Cloudera, Hortonworks, Splunk, and others are rapidly improving Hadoop's usability, but its difficulty remains a hurdle to broader adoption.

Small wonder, then, that two years ago, Cloudera chief strategy officer and co-founder Mike Olson posited that "Hadoop's value will be delivered through cloud apps vendors" that can hide its complexity while still delivering its value. Until then, each company must tackle Hadoop on its own. As Gartner analyst Svetlana Sicular states: "Formulating a right question is always hard, but with big data, it is an order of magnitude harder, because you are blazing the trail (not grazing on the green field)."

For those companies that make the effort, however, Hadoop pays clear dividends, which is why its adoption is accelerating. IDC projects Hadoop revenues to grow at a 60% compound annual growth rate through 2017.
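To put that rate in perspective, here is a minimal sketch of how a 60% compound annual growth rate accumulates. The starting index of 100 and the four-year horizon are assumptions chosen for illustration; the IDC projection as quoted here gives neither a base revenue figure nor a start year.

```python
def project_growth(base: float, cagr: float, years: int) -> list[float]:
    """Return values for year 0 through `years` under a constant compound annual growth rate."""
    return [base * (1 + cagr) ** n for n in range(years + 1)]


# Hypothetical starting index of 100 and a four-year horizon (both assumptions,
# not figures from IDC); only the 60% rate comes from the article.
for year, value in enumerate(project_growth(100.0, 0.60, 4)):
    print(f"year {year}: index {value:.1f}")
```

Under those assumptions, the index ends at roughly 6.5 times its starting value, because 1.6 raised to the fourth power is about 6.55.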

For proof, look no further than Silicon Valley. Of the 1,676 Hadoop-related jobs posted today on Dice.com, 555 are in California, the vast majority of them in Silicon Valley. The tech elite grok the importance of putting their data to work. No one needs to inform Twitter or Facebook of the value of their data. They get it.

Once a company realizes that its data matters, it's going to need to figure out its Hadoop strategy. Over time, we'll see more Hadoop jobs pop up in Iowa and Arizona, not just Silicon Valley.

Hadoop: It's all about growth

Fortunately for Hadoop specifically, and big data vendors in general, nearly every company sees that its data matters. Companies may not know what to do with it, as Gartner found (Figure A), but they know they can't give up:

Figure A

Top big data challenges.

The reason is growth. In a recent Gartner survey, 33% of respondents named growth as their top priority, which nearly equals the sum of the next three issues on the list of top strategic business priorities. If this were just a matter of replacing expensive data warehouses, no one would bother with learning Hadoop or any other big data technologies.

This is why we're seeing enterprises retreat from earlier expectations that Hadoop would be a "good enough and cheap" replacement technology for expensive, legacy infrastructure (Figure B):

Figure B

Data Warehouse Reference Survey.

But in this "retreat," organizations are actually advancing. While the media may love a good rip-and-replace story, it's not very interesting to replace clunky, legacy software. It's far more interesting, and far more important to a company's prospects, to embrace technology like Hadoop to invent the future.

The future is open

That future, as a NorthBridge and Black Duck survey of IT executives shows, will be built with open-source technology like Hadoop. In some cases, this may require a bit more effort on the part of "buyers," because they're no longer simply purchasing commercial off-the-shelf software (which never really "just works" anyway). Instead, they're becoming co-developers of the technologies they opt to use.

We're seeing this with Hadoop. No, it's not super easy to learn, but it's open source and pays dramatic dividends to those who invest the time to learn it.

About

Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. In his day job, he is the vice president of business development and marketing at MongoDB. He was previously chief operating officer at Canonical, ...

1 comment
cn01

Hadoop is extremely difficult to learn, and companies do not see clear benefits from its adoption. Possibly the biggest myth in analytical query processing is that the way to compute the result of a query is to deploy several Hadoop instances across several machines and perform a table scan with a MapReduce framework. This is the most inefficient approach to information retrieval, not only because of the cost of server provisioning but also because of extremely slow response times: simple queries typically take minutes or even hours.

Amisa Server is easy to learn and offers clear benefits. Amisa Server offers advanced indexing capabilities that compute analytic queries in microseconds using the fewest servers.
