Datameer's product is "a set of best practices" for big data analytics, says Karen Hsu. Its 4.0 release gives clients instant visualization at each stage of analysis, saving time and raising productivity.
Hadoop big data analytics firm Datameer released version 4.0 of its flagship product in March 2014. New in this release is the "flipside" feature of the spreadsheet interface, which lets a user visually check the data at each step of the analytics process. The company's value proposition for Datameer 4.0 is saving time and maximizing productivity in data analytics.
"We are consistently pushing the envelope to make big data analysts as productive as possible, and studies show the fastest way to digest information is visually," said Datameer CEO Stefan Groschupf in the 4.0 press release. "With Datameer 4.0, analysts no longer need to wait until the final visualization to gain insights from their data. This new paradigm means companies will realize meaningful ROI on their big data analytics projects faster than ever before."
Karen Hsu, Senior Director of Product Marketing at Datameer, described the product as "a set of best practices on how to do big data analytics" in a recent telephone interview. Summing up the main elements of the 4.0 release, she said: "integration, analytics, and visualization."
Describing the Datameer product, she explained that "we are native on Hadoop, that means we support any distribution, and on top of the distribution, we have an end-to-end data analytics solution. That includes the data integration: we have over 55 pre-built connectors that allow you to automatically bring in data and automate the process of bringing in data from any source."
"I came from Informatica," said Hsu, "so I examined (the data integration) part of the product very closely. I was very impressed with how easy it was for an analyst to use. Our product has been targeted to analysts, as well as to IT people who are not savvy with big data. And so this cuts down on the need to hire an extremely expensive data scientist to do some of this work."
"Second of all," added Hsu, "we have an analytics solution, which is built around a spreadsheet interface. So again as an analyst I can take the experience that I've had with Microsoft Excel my entire life, and reuse it. 4.0 has many pre-built functions that help me not only parse the data, but also analyze it. So I can group the data, I can do math functions, identify outliers, and identify patterns. And, there are text mining functions, and logical functions that help me do that."
"The third part of the product is visualization," said Hsu. "Once you have the analysis you can show it visually in a bar chart, or a time series, or whatever. So we have an end-to-end solution that covers the whole spectrum."
"The different parts that I just talked about — the integration, the analytics, and the visualization — are often three different products," explained Hsu. "And we have put them in one. That's important because of how iterative big data and analytics are, and that's really the focus of 4.0."
With big data, said Hsu, "you have data coming in from many different places, in all different shapes and sizes, and it's extremely hard to bring all this together. As a result, an analyst needs to work very hard to get data into a form that is usable."
Datameer's clients have been feeling the pain. "What we've heard from our customers," said Hsu, "is that they will spend up to 80 percent of their time just doing that: acquiring the data, cleaning it, all the data wrangling, the data manipulation that is needed to actually make the data usable."
From Hsu's viewpoint, there's a problem with using multiple tools for analytics: "Every time you take the data out and put it in another tool you definitely introduce quality issues, not to mention the learning curve problems, licensing and IT related issues, and managing so many different tools."
"The last issue is visualization at the end," said Hsu. "Other tools do the visualization after you acquire the data, after you manipulate the data, at the very end, which is a problem. If I have to wait until the end to see that the data I had was not correct, then I just wasted a whole bunch of time and ended up with bad results, and I have to start all over again."
"What we are addressing" with 4.0, explained Hsu, "is being able to up-front decide where the issues are in the data, being able to say this is a solution and that I as an analyst can do it myself. And finally, I can use one tool, I can use one product. I don't have to go in and out of five different tools. I have one platform."
Asked about Datameer's immediate business goals, Hsu said, "We have a remarkable customer base and hundreds of customers. For the next year the company is focused on scaling that, continuing to blow out those numbers, not only increasing the number of new customers, but also retaining the customers that we do have. So far we've done that, we just want to blow that out."