I recently visited with Scott
Gnau, President of Teradata Labs. Scott has a large staff of data
scientists under his guidance, and we were talking about the challenges of big
data analytics and getting maximum value out of big data.
“Big data is an opportunity and also a challenge, because
suddenly organizations are creating and storing new kinds of data that are nothing
like the digitized data in transaction records that they utilized in the past,”
said Gnau.
Examples are everywhere. In healthcare, there is new big
data coming from EMR (electronic medical records), imaging and diagnostics. In
transportation and even in the monitoring of industrial and home devices, there
are Internet of Things (IoT) sensors and other machine-generated data that flows
into enterprise data repositories. For organizations tracking customers on the
Web, there are Web logs, which are nothing more than strings of characters separated
by commas that must somehow be parsed and then analyzed and grouped into
profiles that analytics can interrogate.
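To make the Web log case concrete, here is a minimal Python sketch of that parse-and-group step. The three-field layout (timestamp, visitor ID, page), the sample rows and the function name build_profiles are assumptions made for illustration; real Web logs vary by server and are usually messier.

```python
import csv
from collections import defaultdict
from io import StringIO

# Hypothetical log layout (an assumption, not taken from the article):
# timestamp,visitor_id,page
SAMPLE_LOG = """\
2015-03-01T10:00:00,v1,/home
2015-03-01T10:00:05,v2,/pricing
2015-03-01T10:01:10,v1,/pricing
2015-03-01T10:02:30,v1,/signup
"""


def build_profiles(log_text):
    """Group comma-separated log rows into one profile per visitor."""
    profiles = defaultdict(lambda: {"visits": 0, "pages": []})
    for row in csv.reader(StringIO(log_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        _timestamp, visitor_id, page = row
        profile = profiles[visitor_id]
        profile["visits"] += 1
        profile["pages"].append(page)
    return dict(profiles)


profiles = build_profiles(SAMPLE_LOG)
for visitor_id, profile in sorted(profiles.items()):
    print(visitor_id, profile["visits"], profile["pages"])
```

Each resulting profile is a small record of how often a visitor appeared and which pages they requested, which is the kind of grouped structure an analytics tool could then query.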
All of this is “new science” to most companies. It
is forcing them to recast teams built on traditional data warehouse mining skills into data science
teams that incorporate statistical analysis and the creation of complex
algorithms that can get to the crux of big data.
However, in the face of all this change, one thing hasn’t changed: the need of enterprises
to know, right now, why they are being beaten
by their competitors in a particular market, or why they are losing money in
their operations. This puts pressure on enterprise data scientists because they
don’t have the same freedom to experiment (and to fail, if necessary!) with big
data that their academic counterparts do. A lot is at stake – because if enterprises
fail to seek out new answers to their old business problems from big data, they
are likely to get the same answers about corporate performance that they’ve
always gotten from their traditional analytics.
How do enterprises balance their big data approaches so they
meet the needs of their business cases but also enable enough experimentation
with big data so they can get breakthrough answers to questions they had never
thought to ask?
Four key cornerstones
#1 Hire creative risk takers
Gnau calls these individuals “artist-explorers.” They
look at data in new ways, they are creative, and they aren’t afraid to fail. They
often come from liberal arts and music backgrounds – and they may not be
hardcore statisticians.
#2 Know when to cut your losses
Unlike academic institutions, enterprises have limits on how
long they can afford to allow experimental work without netting returns. “The
most successful creative data scientists have a ‘fail fast’
mentality,” said Gnau. “In other words, they know how to filter out
noise from true symbols of intelligence that come from data, and they also know when
to ‘pull the plug’ when they can see that they are pursuing a direction
that is a waste of time.”
The enterprise also has to create realistic checkpoints on
projects – to see if they are worthy of being continued, or whether new
projects should take their place.
#3 Accept and learn from failures
Pure experimentation with data can be rewarding because it positions
the enterprise for breakthrough intelligence it never could have anticipated.
At the same time, however, there is a very high degree of failure in the data
discovery process. If an enterprise is to encourage and reward experimental research
capable of producing breakthrough intelligence, endorsement for this discovery
process must come from C-level executives and percolate all the way through the
organization. It should be understood and expected that there is a high failure
rate that goes with uncovering remarkable data – and that the solution is to
try again, and not to abandon the effort.
#4 Maintain a balanced workload and budgetary approach
Data experimentation and discovery is R&D work that must
be balanced with more business case-driven big data analytics projects. The
ideal workload is a mix of the two so that the business sees both near-term and
longer-term, potentially more far-reaching intelligence coming in. From a
budgetary commitment standpoint, this balance also needs to exist – even in “down”
years, when many projects are sacrificed on the budgetary chopping block.