Advances in data networks and storage mean organizations hold
more data than they ever have – perhaps a stream of measurements from
manufacturing equipment, from vehicles, or from game-changers like web-enabled
refrigerators (no, I’ve never seen one either).
The enterprise CTO may have the data storage part all figured
out – their MongoDB cloud database
is in place, or they rent DBaaS from Cloudant.
But why? What does an enterprise do with all this unstructured data?
The first thing is to identify what the enterprise wants.
Analytics can be an area of blind faith – if the enterprise is not
clear about its big data needs, it may just hope that something good pops up.
Identify the big data needs
Data analytics, like all IT, is subordinate to business needs. An
organization must figure out its requirements before working on big data.
No two organizations are the same, so there is always a
variation in needs. The IT department may receive requirements like these.
- Crunch data for instant reports.
- Decode telemetry on the fly.
- Find a needle in a haystack in a vast quantity of data.
- Find the regular operational patterns in a vast
quantity of signals.
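
To make the "needle in a haystack" requirement concrete, here is a minimal sketch of one way to flag unusual readings in a batch of measurements. It is an illustration under assumptions, not a recommended product: the function name `find_outliers`, the sample readings and the threshold of 3.5 are all hypothetical, and the technique shown is the modified z-score based on the median absolute deviation.

```python
from statistics import median


def find_outliers(readings, threshold=3.5):
    """Return (index, value) pairs that sit far from the median.

    Uses the modified z-score (based on the median absolute deviation),
    which is less distorted by the outliers themselves than a
    mean-based score. The threshold is an assumed, commonly used default.
    """
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        return []  # no spread to measure against
    return [
        (i, x)
        for i, x in enumerate(readings)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Hypothetical temperature readings from one machine, with one spike.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.7, 20.2, 20.1]
print(find_outliers(readings))
```

The same idea, run in reverse, covers the other requirement: readings that are not flagged describe the regular operational pattern.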
Analytics is a service-oriented area, so the CTO could just finish
their work there and outsource the rest. If they decide to keep it in-house, they
need a few more things.
Get some analytics
Analytics applications help turn large data sets into business
value. The enterprise uses analytics tools to tackle the difficult job of doing
something useful with its unstructured data.
Data analytics products are one of the big
data technologies and live in a data scientist’s toolbox. Analytics products
don’t usually deliver ready-made business value.
When an organization purchases analytics applications, they
must leave plenty of cash for the training budget. Complex tools are not
Write a big data policy
Managing large data sets is a difficult job. The big data
manager has plenty of moving parts to configure to meet these requirements.
- What is the retention policy? What parts of the
data pool can be deleted, and when? What happens to the rest of the historical data?
- What is the data protection policy? Who gets to
view data? What are the privacy implications? What are the legal restrictions?
- Where is the data stored? If a cloud provider is
holding the data, how do we get it back?
- What kind of meta-data is required? How can
anyone identify the purpose of a big data store?
- How many data sets are there, and how can they
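
One way to keep the answers to these questions from drifting is to record them per data set in a small, machine-readable form. The sketch below is only an illustration: the class `DataSetPolicy`, its field names and the example values are all hypothetical, and a real policy would need legal and business input that code cannot supply.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataSetPolicy:
    """Hypothetical policy record for one big data store."""

    name: str                 # which data set this is
    purpose: str              # metadata: why the data is kept
    location: str             # where the data is stored
    retention_days: int       # retention policy
    allowed_roles: frozenset  # data protection: who may view it

    def is_expired(self, created: date, today: date) -> bool:
        """True if data created on `created` is past its retention period."""
        return today - created > timedelta(days=self.retention_days)

    def can_view(self, role: str) -> bool:
        """True if the given role is allowed to view this data set."""
        return role in self.allowed_roles


# Example values are invented for illustration.
policy = DataSetPolicy(
    name="machine-telemetry",
    purpose="Predictive maintenance for factory equipment",
    location="cloud provider, EU region",
    retention_days=365,
    allowed_roles=frozenset({"analyst", "plant-engineer"}),
)

print(policy.is_expired(date(2023, 1, 1), date(2024, 6, 1)))
print(policy.can_view("marketing"))
```

A list of such records also gives a rough answer to how many data sets there are and what each one is for.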
Assemble an analysis team
The first part of building a team is partnering up a business
executive and an IT sponsor. Both are required.
There may be a data warehouse and data miners in the
organization, but probably no data
scientists. There are a few ways of getting some.
- Hire experts. Pros
are in demand.
- Hire people with the right capability and let
- Spot the budding statisticians in your
organization and grab them.
Spotting capability means looking for clues. John Foreman is chief scientist at
Mailchimp and writes a blog on
data science. If someone is a fan of his work, that’s a clue. Perhaps one
of the data miners has an artistic streak. The person obsessively dragging
consumer behaviour out of click trails is worth talking to.
That still leaves some gaps
A few huge organizations, like telecoms companies and global
retailers, have been battling with the problem of analytics for decades. They
have specialist teams, home-grown tools, and years of experience. Alongside
their expensive specialized capabilities, a brave new world of big data and commoditized
data analytics is appearing. There is quite a way to go.
- The enterprise is doing new things with existing data sets, rather than
collecting new data.
- Plenty of big data tools exist, but few are ready for business users.
- Organizations in many parts of the world have
not started exploiting big data.
- Machine learning is required to extract signal from noise.
It takes statistical, technical and business expertise to get value
from big data. Even where the analytics tools exist, they must be tailored for
business needs – it’s not a one-size-fits-all world.
Over to you, big data startups around the world.
Plug those gaps.