Balance your big data analytics goals to enable breakthroughs

Enterprises need to balance the needs of their business cases with experimentation so they can get breakthrough answers to questions they had never thought to ask.

I recently visited with Scott Gnau, President of Teradata Labs. Scott has a large staff of data scientists under his guidance, and we were talking about the challenges of big data analytics and getting maximum value out of big data.

"Big data is an opportunity and also a challenge, because suddenly organizations are creating and storing new kinds of data that are nothing like the digitized data in transaction records that they utilized in the past," said Gnau.

For instance

Examples are everywhere. In healthcare, there is new big data coming from EMR (electronic medical records), imaging and diagnostics. In transportation and even in the monitoring of industrial and home devices, there are Internet of Things (IoT) sensors and other machine-generated data that flows into enterprise data repositories. For organizations tracking customers on the Web, there are Web logs, which are nothing more than strings of characters separated by commas that must somehow be parsed and then analyzed and grouped into profiles that analytics can interrogate.
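To make the Web log case concrete, here is a minimal sketch in Python of turning comma-separated log lines into per-visitor profiles. The three-field layout (timestamp, visitor ID, page), the sample data and the function name build_profiles are all assumptions made for illustration; real Web logs vary widely in format and usually need far more cleanup.

```python
import csv
from collections import defaultdict
from io import StringIO

# Hypothetical sample log: each line is timestamp,visitor_id,page (an assumed layout).
SAMPLE_LOG = """\
2015-03-01T10:00:00,v1,/home
2015-03-01T10:00:05,v2,/pricing
2015-03-01T10:01:00,v1,/pricing
not,a,valid,line,with,extra,fields
2015-03-01T10:02:00,v1,/home
"""


def build_profiles(log_text):
    """Count page views per visitor, skipping lines without exactly three fields."""
    profiles = defaultdict(lambda: defaultdict(int))
    for row in csv.reader(StringIO(log_text)):
        if len(row) != 3:
            # Malformed or empty line: ignore it rather than guess its meaning.
            continue
        _timestamp, visitor_id, page = row
        profiles[visitor_id][page] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {visitor: dict(pages) for visitor, pages in profiles.items()}


profiles = build_profiles(SAMPLE_LOG)
print(profiles)
```

Each resulting profile is simply a count of pages viewed per visitor, which is the kind of grouped structure that downstream analytics could then query.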

All of this is "new science" to most companies. It is forcing them to recast traditional data warehouse mining skills into data science teams that incorporate statistical analysis and the creation of complex algorithms that can get to the crux of big data.

However, in the face of all this change, one thing hasn't changed: the need of enterprises to know, right now, why they are being beaten by their competitors in a particular market, or why they are losing money in their operations. This puts pressure on enterprise data scientists because they don't have the same freedom to experiment (and to fail, if necessary!) with big data that their academic counterparts do. A lot is at stake - because if enterprises fail to seek out new answers to their old business problems from big data, they are likely to get the same answers about corporate performance that they've always gotten from their traditional analytics.

How do enterprises balance their big data approaches so they meet the needs of their business cases but also enable enough experimentation with big data so they can get breakthrough answers to questions they had never thought to ask?

Four key cornerstones

#1 Hire creative risk takers

Gnau calls these individuals "artist-explorers." They look at data in new ways, they are creative, and they aren't afraid to fail. They often come from liberal arts and music backgrounds - and they may not be hardcore statisticians.

#2 Know when to cut your losses

Unlike academic institutions, enterprises have limits on how long they can afford to allow experimental work without netting returns. "The most successful creative data scientists have a 'fail fast' mentality," said Gnau. "In other words, they know how to filter out noise from true symbols of intelligence that come from data and they also know when to 'pull the plug' when they can see that they are pursuing a direction that is a waste of time."

The enterprise also has to create realistic checkpoints on projects - to see if they are worthy of being continued, or whether new projects should take their place.

#3 Accept and learn from failures

Pure experimentation with data can be rewarding because it positions the enterprise for breakthrough intelligence it never could have anticipated. At the same time, however, there is a very high degree of failure in the data discovery process. If an enterprise is to encourage and reward experimental research capable of producing breakthrough intelligence, endorsement for this discovery process must come from C-level executives and percolate all the way through the organization. It should be understood and expected that there is a high failure rate that goes with uncovering remarkable data - and that the solution is to try again, and not to abandon the effort.

#4 Maintain a balanced workload and budgetary approach

Data experimentation and discovery is R&D work that must be balanced with more business case-driven big data analytics projects. The ideal workload is a mix of the two so that the business sees both near-term and longer-term, potentially more far-reaching intelligence coming in. From a budgetary commitment standpoint, this balance also needs to exist - even in "down" years, when many projects are sacrificed on budgetary cutting blocks.