Two years ago, Gartner analyst Merv Adrian lambasted the "nearly non-existent response" to Hadoop's "security issue," dubbing it "shocking." "Can it be that people believe Hadoop is secure?" he asked. "Because it certainly is not. At every layer of the stack, vulnerabilities exist."
Hadoop vendors like Cloudera and Hortonworks have made their millions selling, in part, solutions to these gaping security holes. That's progress. But the real progress, as DBMS2 analyst Curt Monash noted, is that enterprises actually care. Even so, he went on to acknowledge, "data security is a real mess" and doesn't promise to get better anytime soon.
A big need for security
Even as data becomes critical to today's enterprise, securing it keeps getting harder. As Monash highlighted in his post, "technology trends have created new ways to lose data." The cloud is complicit in this compromising of data, with enterprises needing to secure and move data from third-party servers to their own, or to other public clouds. As Monash pointed out, "it is an extremely common analytic practice to extract data from somewhere and put it somewhere else to be analyzed. Such extracts are an obvious vector for data breaches, especially when the target system is managed by an individual or IT-weak department."
Yet such movement of data is increasingly common, presenting all sorts of security issues.
This has led to more cloud-based security solutions, as I've written, but the gap between best intentions and best practices is huge. Even where enterprises feel they have a handle on data they completely control, much of their data is inter-company in nature (e.g., SaaS). As Monash stressed, "even putting your data under control of a SaaS vendor opens hard-to-plug security holes."
Vendors getting paid for security
Big data's inherent insecurity, however, is leading vendors to invest more, and get paid more, for remedying the situation. Hortonworks CEO Rob Bearden called out security as a key differentiator in his company's offerings on a recent earnings call: "We also enhanced [our] security and data science capabilities. And now all of these capabilities and functionality [are] delivered on our core multi-tenancy engine with integrated security and governance platform."
A few minutes later he went back to the security well: "Hortonworks is delivering a connected data architecture with a common operational management, security and governance framework across that same connected data platforms set as well as across cloud services." Indeed, throughout the call security was highlighted as a key component of the Hortonworks Data Platform.
Cloudera, for its part, mentioned security even more frequently on its inaugural earnings call, starting with this opening salvo from CEO Tom Reilly:
"Traditional technologies for collecting, storing and analyzing data are inadequate and [in] this era of big data and machine learning, they are technically incapable and too expensive. Organizations require a distribut[ed] platform that [is] not only designed for this purpose, but also satisfies the performance, governance, compliance and security demands o[f] a large enterprise or public sector entity. Most of all the platform must scale cost effectively and run anywhere, cloud, multi-cloud and on-premises."
Unlike Hortonworks, however, Cloudera pushes the security envelope with a hybrid model that mixes open source and its own proprietary software. Reilly also declared: "It's our proprietary software that delivers enterprise grade data governance, compliance, management, and security including encryption and key management, as well as platform as a service cloud offerings."
Whether one believes the right way to improve big data security is through open source or proprietary software, it's fantastic that vendors are finally taking it seriously. Actually, that's not fair—Hortonworks and Cloudera, among others, have long taken security seriously. What was missing was enterprise willingness to pay for that security. This gap seems to be resolving itself, to the betterment of enterprise security, not to mention the bank balances of the big data vendors.
Matt is currently head of the developer ecosystem at Adobe. The views expressed are his own, not those of his employer.
Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.