As much as we talk about Hadoop, we should be seeing more enterprise deployments. But as Gartner recently highlighted, the promise of Hadoop and the reality remain distinctly different things.
More critically, it's very possible that the market will have moved on from Hadoop by the time it really comes into its own. While we can assume with some confidence that big data is here to stay, it's a much harder task to predict which big data technologies will dominate.
Hadoop stutters to start
As Gartner analyst Nick Heudecker pointed out, "Thru 2018, 70% of Hadoop deployments will not meet cost savings and revenue generation objectives due to skills and integration challenges."
That's a pretty damning assessment.
But it's not surprising in many ways. After all, though we've seen companies steadily moving from pilot to production in their big data projects over the past few years, we continue to see clear indications that enterprises aren't diving into Hadoop in the way we expected.
One such indication is a distinct lack of concern with security.
It's hard to imagine enterprises being serious about putting their data into Hadoop if they're not equally serious about securing that data.
That hesitancy also showed up in a 451 Research survey, which indicated that Hadoop deployments remain a tiny fraction of the total enterprise storage footprint.
This may be simply a question of timing.
New data workloads, new data infrastructure
Companies aren't ripping out enterprise data warehouses and replacing them with Hadoop, just as they tend not to rip and replace much of anything. "If it's working, don't fix it" seems to be the motto.
Sure, we can see into the future of enterprise technology by observing what companies like Google and Facebook use today, as Neo Technology CEO Emil Eifrem highlighted. Why? Because "they are already today dealing with the volume and shape of data that everyone else will be working on in five years from now."
Instead, the enterprise seems to be embracing new data infrastructure like Hadoop mainly for net new workloads, a pattern I wrote about recently in connection with the trend toward NoSQL databases.
The same is true of cloud computing. As Gartner's Tom Bittman described, "New stuff tends to go to the public cloud, while doing old stuff in new ways tends to go to private clouds." As such, private clouds are much closer to the old data center way of doing things.
Hadoop, then, is a big deal. It's just not pervasive yet.
And maybe it never will be in terms of core infrastructure that mainstream enterprises need to learn. As Cloudera co-founder Mike Olson said years ago, Hadoop will truly become a big deal when it's embedded in easier-to-use applications. So long as Hadoop adoption depends on enterprise IT coming up to speed on still complex technology, it's going to remain relatively niche.
However, an even bigger question is whether Hadoop will be the big data technology to know. Apache Spark is already displacing Hadoop among many of the data elite for a variety of reasons. The big data infrastructure landscape changes so fast that it's difficult to predict what will really dominate long term. The only thing we know for sure is that big data as a general category is here to stay and is increasingly important.
Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.