What to do when big data gets too big

It's possible to hit a point of diminishing returns with big data collection. Here's how to avoid getting bogged down by unnecessary information.

In the 1950s, scientists proposed that the human brain can only recall a list of about seven items. Brain power hasn't changed much since then—but the amount of data that we are getting bombarded with certainly has.

"We have all of the data that we need, but what is missing in companies is people with the business acumen to take what they learn from data analytics and truly create breakthrough opportunities for the business," said Don Sullivan, product line marketing manager for VMware.

I visited with Sullivan at the recent Microsoft PASS Summit, a conference for DBAs and analytics experts who use Microsoft and related data platforms and analytics tools.

At the conference, I saw many fascinating database automation tools that could join disparate data sets and perform acrobatics with databases, data and applications—but it was also a conference remarkably lacking in discussion about how to take these treasure troves of data and transform them into meaningful business value.

I came away with a question: How much big data (and analytics) is enough to obtain business value?

In network monitoring and manufacturing, we have machines talking to each other on production floors and endpoints in corporate networks talking to each other. The machines and endpoints collect and transmit valuable nuggets of information, but these nuggets are embedded in a stream of worthless machine gibberish. Does a network administrator need all of it?

Smart barcode labels can now carry as many as 7,000 characters of data about an item. For example, a barcode on a sweater might tell you how many stitches the sweater contains. But do you need this if your job is just to make sure that the item has left the manufacturer on time and will be at the warehouse or retail store in time for the holidays?

In other words, whether we are talking about network outputs, sweaters, or TV video signals, there appears to be a point of diminishing returns where the value you get from your data begins to decline.

There are two primary trigger points for diminishing data value:

  • Data begins to be produced without a business case for producing it
  • Data is presented with so much complexity that users simply don't know what to do with it

Here's how you can counteract these triggers:

Strategies for data without a business case

  • Always define a clear business case with desired outcomes (e.g., reduce operating costs in manufacturing) before designing data marts, plugging in IoT, etc. The more narrowly you focus your goals, the more likely your staff will be able to stay on task.
  • Each week, check your data analytics projects for project "drift." In other words, has the project started drifting away from the business case it was intended to address? If you see a project starting to drift, correct course and get back to the business case.
  • Never leave an analytics project in the hands of technicians alone. If you lack a person who is savvy about the business, the project might be a technical masterpiece but a business failure. To avoid this pitfall, a business-savvy end user or IT business analyst should be in the lead role to ensure fidelity between the project and the business goal.

Strategies for overly complex, or simply too much, data

  • Before you start designing an analytics application, understand what the end user needs. If the user is working outdoors in a cold and rainy freight yard, they shouldn't have to struggle with layers of drill-down menus on a mobile device. A single visual that shows the locations of trailers and highlights critical issues might be sufficient.
  • Stick with the bottom lines of your business case. If the goal is to see how many flu cases are in various districts of the city, go with it. Tomorrow's project might be to add additional demographics like an overlay of unemployment rates, etc., but it is not today's work.
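
To make the flu example concrete, here is a minimal Python sketch of what sticking to the bottom line can look like in practice. The records, field names, and the flu_cases_by_district helper are all hypothetical and exist only for illustration. The point is that the code answers the one question in the business case and leaves the extra fields alone.

```python
from collections import Counter

# Hypothetical case records. The field names are assumptions made for this
# sketch; a real data set would look different.
CASES = [
    {"district": "North", "diagnosis": "flu", "age": 34, "employed": True},
    {"district": "North", "diagnosis": "flu", "age": 61, "employed": False},
    {"district": "South", "diagnosis": "flu", "age": 8, "employed": None},
    {"district": "South", "diagnosis": "cold", "age": 45, "employed": True},
]


def flu_cases_by_district(records):
    """Count flu cases per district, ignoring every field the business case does not need."""
    return Counter(r["district"] for r in records if r["diagnosis"] == "flu")


if __name__ == "__main__":
    # One number per district is the entire deliverable for today's project.
    for district, count in sorted(flu_cases_by_district(CASES).items()):
        print(f"{district}: {count}")
```

The age and employment fields are present in the sample data but deliberately unused. Overlaying them would be tomorrow's project, not today's.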

About Mary Shacklett

Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President o...