Big data initiatives are a high priority for 60% of enterprise organizations, according to a January 2014 IDG Enterprise study. Some people say the future of big data is Apache Hadoop. For now at least, there's "no doubt about the importance of the market for Hadoop," writes Deborah Gage in The Wall Street Journal.
But while it has great potential as a big data solution, Zettaset CEO Jim Vogt says that Hadoop was not created with security in mind. What his firm wants to do with its flagship product Orchestrator is make Hadoop more secure and hardened for enterprise-level performance and analytics.
"There are a lot of things missing in the code itself," said Vogt about Hadoop in a recent telephone interview. "It's rather hard to install, it really doesn't have a very robust, high availability, and the foremost thing it doesn't have is security."
Security concerns, along with compliance, are factors in the Hadoop market. "Most people are holding back on deploying Hadoop," said Vogt, "especially financial and healthcare environments, where they have hard compliance [demands], because it just doesn't have the security features that they require. So compliance is driving a lot of need for securing this technology."
"We are basically an enterprise software company," Vogt said about Zettaset. "What we focus on is transparently augmenting the open source around Hadoop so that we can make it hardened for the enterprise." Emphasizing the company's commitment to transparency, he added, "We are trying to be as open as we can in providing our big data management solution."
Vogt gave a piece-by-piece description of how Orchestrator is built to address Hadoop big data security in the enterprise.
"On the security side," said Vogt, "this is a batch system, this is a distributed computing system, so it's a little more difficult in terms of how you secure the crypto-assets, the keys, and how we actually implement encryption."
"What we had to develop first," explained Vogt, "was a real extensive RBAC umbrella, role-based access control. We are not re-creating the roles and policies, these already exist on an RBAC server or an active directory, most likely. We link those, and essentially pass those through to this infrastructure, transparently, again."
After the RBAC umbrella is built, said Vogt, "the data can be anywhere. It can be encrypted or not encrypted, it can be duped across the clusters. You've got to have a [security] policy to address who gets access to what, and what people see in the clear unencrypted format."
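To make the idea concrete, here is a minimal Python sketch of a policy that decides both who may read a record and who sees it in the clear. It is an illustration only, not Zettaset's implementation: the role names, the `Policy` class, and the `view_field` helper are all hypothetical, and a real deployment would pull roles from the existing RBAC server or directory rather than a hard-coded table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """What a role is allowed to do with a protected field (hypothetical)."""
    can_read: bool
    sees_cleartext: bool


# Assumption: illustrative roles only. In the setup Vogt describes, roles
# already exist on an RBAC server or directory and are passed through.
ROLE_POLICIES = {
    "analyst": Policy(can_read=True, sees_cleartext=False),
    "compliance_officer": Policy(can_read=True, sees_cleartext=True),
}


def effective_policy(roles):
    """Combine the policies of every known role a user holds.

    The most permissive setting wins; unknown roles grant nothing.
    """
    known = [ROLE_POLICIES[r] for r in roles if r in ROLE_POLICIES]
    return Policy(
        can_read=any(p.can_read for p in known),
        sees_cleartext=any(p.sees_cleartext for p in known),
    )


def view_field(roles, value):
    """Return the value as this user is allowed to see it."""
    policy = effective_policy(roles)
    if not policy.can_read:
        raise PermissionError("no role grants read access")
    # Users without cleartext rights get a masked value of the same length.
    return value if policy.sees_cleartext else "*" * len(value)
```

In this sketch, an analyst calling `view_field(["analyst"], "123-45-6789")` gets back a masked string, a compliance officer gets the original value, and a user with no recognized role is refused outright.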
"Our first umbrella," added Vogt, "we laid out in April of last year. And then we started layering in the security services, and then we have encryption for data at rest."
"We also just announced encryption on data in motion, and that's really securing the data as it moves between the nodes, within the cluster. And it also secures our console. So there's no means by which you can actually get in, tap into the network, and essentially extract critical data from the store."
The last piece in the setup, according to Vogt, is what Zettaset calls a BI Connector.
"This is a standard ODBC/JDBC connector, that allows us now to integrate these enterprise applications, so when you authenticate and log in to the applications, and attempt to run reports, basically we are taking those roles and policies, and applying them to that end user and being able to enact security for that application."
"Now you've got all the pieces," said Vogt. "You've got a full end-to-end secure solution, where essentially you are open at one end to all the various applications, and you're open at the other end in terms of all the open source solutions that we can run on."
Through its BI Connector, Zettaset is building an ecosystem of analytics and data management partners. In April 2014, the company announced technology alliances with Simba and Revelytix.
TechRepublic: What are the elements of a robust big data security strategy?
Jim Vogt: One of the keys to supplying the security infrastructure, what Zettaset does, is to mask the complexity of the solution. You shouldn't have a team of eight Hadoop engineers, you don't need that.
So one of the trends is repeatability and scalability and ease of deployment of these security policies. People have a lot of the stuff already in place. It is not like this is the first technology that you are having to secure.
What is of value to the customer is to link all of those existing security policies to the infrastructure, so that you don't have to recreate all of that infrastructure. And that's what I find kind of amazing about this market. Why are we trying to re-create a whole approach in Hadoop when it already exists?
These guys have been so sucked into Hadoop, they try to re-create a whole new ecosystem within Hadoop, but there are mixed components of open source out there that actually have applicability to the solution.
But basically, it's not just about managing Hadoop, guys. It's about building robust, secure stores for these mixed data sources, so that people can actually do something with it.
And the other interesting thing about our [approach] is that we are the least glamorous and most critical component of the solution. If you can't build a store the way we are building it (a scalable, large parallel repository around a batch system that is secure, easy to deploy, and not resource-intensive), then forget about the analytics.
So, do it with software, do it securely, do it in a repeatable fashion, and automate it. That's what you need to do, and that's what will get the market to move.
Brian will do client work for AtTask.
Brian Taylor is a contributing writer for TechRepublic. He covers the tech trends, solutions, risks, and research that IT leaders need to know about, from startups to the enterprise. Technology is creating a new world, and he loves to report on it.