Big data adds big value, but can be hard to understand. Here's what you need to know to get your feet wet with big data.
For the uninitiated, big data is a term for data sets so massive that they require specialized tools to process and analyze. The popularity of big data has grown rapidly over the past few years, driven by an increase in data capture mechanisms and in tools that help businesses glean insights from the data they have access to.
The field is exploding, but it can be difficult to navigate for those who are late to the game. Here are 10 things to help you better understand the big data phenomenon.
Big data is a big market
A recent IDC report forecast that the big data market will see a compound annual growth rate of 27% and will reach $32.4 billion in 2017. The report also forecast that a lot of data that can be considered "big data" will be disposed of or moved to the cloud, which will impact traditional datacenter revenue for storage.
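For readers unfamiliar with compound annual growth rates, here is a minimal Python sketch of the arithmetic behind a figure like that one. The 27% rate and the $32.4 billion end value come from the forecast as quoted above. The four-year span is purely an assumption for illustration, since the forecast's base year is not given here, and the function names are hypothetical.

```python
def project(start_value: float, annual_rate: float, years: int) -> float:
    """Compound start_value forward at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


def implied_start(end_value: float, annual_rate: float, years: int) -> float:
    """Work backward from an end value to the starting value that compounds to it."""
    return end_value / (1 + annual_rate) ** years


# 27% and $32.4 billion are from the quoted forecast.
# ASSUMED_YEARS is an illustrative assumption, not a figure from the report.
RATE = 0.27
END_VALUE_BILLIONS = 32.4
ASSUMED_YEARS = 4

start = implied_start(END_VALUE_BILLIONS, RATE, ASSUMED_YEARS)
print(f"Implied starting market size: ${start:.1f} billion")

# Show how the market compounds year by year under that assumption.
for year in range(ASSUMED_YEARS + 1):
    print(f"Year {year}: ${project(start, RATE, year):.1f} billion")
```

Changing ASSUMED_YEARS shows how sensitive the implied starting point is to the length of the forecast window.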
Big data is messy
Data in its raw form can be extremely challenging to manage. Data scientists are often dealing with terabytes, or even petabytes, of data that could have come from six or seven disparate sources. Each data set could be formatted differently or contain different data points.
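To make that concrete, here is a small Python sketch of what reconciling two differently formatted sources might look like. The records, field names, and cleaning functions are all hypothetical, and real-world preparation is usually far messier than this.

```python
from datetime import datetime

# Hypothetical records from two sources describing the same kind of event
# with different field names, date formats, and number formats.
source_a = [{"customer": " Alice ", "date": "2014-03-01", "amount": "19.99"}]
source_b = [{"Customer Name": "BOB", "Date": "03/02/2014", "Total": "$5.00"}]


def clean_a(row):
    """Normalize a record from source A into the shared schema."""
    return {
        "customer": row["customer"].strip().lower(),
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        "amount": float(row["amount"]),
    }


def clean_b(row):
    """Normalize a record from source B into the shared schema."""
    return {
        "customer": row["Customer Name"].strip().lower(),
        "date": datetime.strptime(row["Date"], "%m/%d/%Y").date(),
        "amount": float(row["Total"].lstrip("$")),
    }


# After cleaning, both sources share one schema and can be analyzed together.
cleaned = [clean_a(r) for r in source_a] + [clean_b(r) for r in source_b]
for record in cleaned:
    print(record)
```

Each new source needs its own mapping into the shared schema, which is one reason preparation can take up so much of the work.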
"Big data is generally not neatly formatted and today data scientists spend up to 80% of their time cleansing and preparing data," said Ping Li, an investor with Accel Partners who heads up the firm's Big Data Fund.
Variety equals value
Data can come from anywhere, so it is important to pull data from both internal and external sources. According to Matt Belkin, chief strategic solutions officer at Domo, the real value lies in overlaying multiple data sources to see the story that emerges among the data sets.
"When your business data lives in isolation, you only get part of the story," Belkin said. "The real value in big data — or lots of data — is being able to identify relationships between different data sets, which often tell a more compelling and comprehensive story, allowing you to manage your business more effectively."
Data is worthless
Big data, on its own, is worth almost nothing. Its value comes from the insights that data scientists can derive by processing and analyzing that data. As Mike Olson, CSO at Cloudera, said, "We can't act on it until we understand it, and we can't understand it until we clean it, process it, analyze it, explore it."
Big data is not for everyone, yet
Big data is still a relatively new phenomenon. Sure, many companies are racing to capture and exploit as much data as they can, but that doesn't mean that every company is currently in a position to begin a big data initiative.
"The big data hype makes it easy to believe every company is orchestrating a mind-blowing big data strategy," Belkin said. "The reality is that most companies are still struggling to find value from the "small data" and while they know what big data is, most are unsure about how to harness it and put it to use."
Big data still needs human input
The big data process still requires quite a bit of human input to properly achieve its goals. Data scientists have to manually sift through the data, refine it, and shape it before it can be used for insights.
However, Nenshad Bardoliwalla, VP of products and co-founder of Paxata, said it's not feasible to keep throwing more humans at data preparation.
"It's therefore critical that we deploy state-of-the-art techniques such as machine learning and semantic analysis to automate the preparation process which, today, consumes so much human bandwidth," Bardoliwalla said.
Big data needs new tools
We have access to more data now than we have ever had in the past. Data capture mechanisms are constantly evolving to collect different types of data, and in greater quantities. Because of this, we will need tools that can keep up.
Olson called enterprise implementation of machine learning and sophisticated modeling "remarkable," noting that just a few years ago these tools were being developed in research labs. As data availability increases, so will the need for new ways to analyze that data.
Big data is pervasive
Although the big data process can be cumbersome, it can produce some valuable insights, and those insights extend well beyond your IT department. Big data insights can help you better understand aspects of your business such as sales, customer engagement, and risk. For example, users can overlay weather data in certain geographic areas to see how it impacts foot sales at retail locations.
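As a rough illustration of that kind of overlay, the Python sketch below matches daily sales for a single store against the weather on the same days and averages sales by condition. All of the figures, dates, and names are made up for the example. A real analysis would need far more data and would have to account for factors such as day of the week and seasonality.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical internal data: daily sales for one retail location.
sales = {
    "2014-06-01": 1200.0,
    "2014-06-02": 800.0,
    "2014-06-03": 1100.0,
    "2014-06-04": 700.0,
}

# Hypothetical external data: weather conditions for the same area.
weather = {
    "2014-06-01": "sunny",
    "2014-06-02": "rain",
    "2014-06-03": "sunny",
    "2014-06-04": "rain",
}


def average_sales_by_weather(sales, weather):
    """Overlay the two data sets by date and average sales per weather condition."""
    grouped = defaultdict(list)
    for day, amount in sales.items():
        condition = weather.get(day)
        if condition is not None:  # skip days with no weather reading
            grouped[condition].append(amount)
    return {condition: mean(amounts) for condition, amounts in grouped.items()}


averages = average_sales_by_weather(sales, weather)
for condition, average in averages.items():
    print(f"{condition}: {average:.2f}")
```

Neither data set says much about weather and sales on its own. The relationship only becomes visible once the two are lined up by date.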
Data is everywhere
Advances in sensor technology and the growth of the internet of things mean that big data is available to companies in almost every industry.
"From agriculture to government to pharmaceuticals, big data is an ingredient that has the potential to help change/improve business practices in almost every industry," Li said.
Big data hasn't started yet
The growth of big data has been driven, so far, by traditional systems running faster and in larger numbers, according to Olson. The real growth, he said, is yet to come.
"Over the next ten years, real growth will be driven by instrumentation of the planet: Sensors in engines and buildings and roadways, for example," Olson said. "We've had those for some time, sure, but they were never connected to the network. That's changing now. As it does, it will drive a dramatic change in the character and the quantity of data available to us."