Real-time data science: How to embrace this new reality

When applying data science to real-time applications, don't stop until you reach the top of the DIKW pyramid.

A client asked me to run a weeklong workshop in Houston, Texas. When the workshop ended, I went straight from my client's office to the airport with the help of Google Maps. I found it interesting that the app routed me through the side streets instead of the more direct freeway route. I'm guessing it did this because it "knew" there was heavy traffic on the freeway, and the app was able to calculate the quickest route in real time based on my current conditions.

This inspired me to think about how data science can be applied in real time, and it brings to mind the Data, Information, Knowledge, Wisdom (DIKW) pyramid.

Working your way to the top

The top of the DIKW pyramid is where the most valuable products are waiting to be developed. I frequently reference this model because I feel it represents a logical evolution of sophistication when it comes to solutions that involve data and analysis.

Each level of the pyramid builds on the level beneath it. As such, wisdom is the pinnacle that represents the highest level of analytic sophistication, and that's where you should push your data scientists to go to be competitive with your products and services that incorporate data science. Here's how to apply this evolution to a real-time analytic application.

Data

Your product's data are the fundamental building blocks of the application. Data are raw, unrefined deposits of bits and bytes that help your customers or end users accomplish whatever they're trying to do. Although this is your starting point, don't underestimate the economic value of providing customers with data, especially if you deliver it in real time. Many day-trading quants, for example, want nothing more than a real-time market feed, because they typically build their own sophisticated systems to analyze it.

Data are still powerful for real-time applications, even if they aren't refreshed on a frequent basis. My 2002 BMW had an ancient GPS system that relied on CDs. Even so, it got me around town just fine, even though the data on the CDs were several years old.

Information

As data scientists, we can do a lot better than just data. The next level is information, and it adds the "so what?" to the mix. Most end users don't want to spend time figuring out what all the data means, so your data scientists should answer that for them.

Since real-time applications are typically time-sensitive, most people don't want to be blasted with raw data; they want to know what to do as a result of the data. Displaying a map of your location is fine; however, we also expect our GPS system to tell us when to turn left and right. This is the information we're looking for.
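As a rough illustration of the step from data to information, the sketch below turns raw waypoints into turn instructions. It is only a minimal example under stated assumptions: the function names `turn_instruction` and `directions` are hypothetical, and positions are simplified to (x, y) pairs on a flat grid with x pointing east and y pointing north, which is not how a real GPS unit represents location.

```python
def turn_instruction(prev, here, nxt, tolerance=1e-9):
    """Return 'left', 'right', 'straight', or 'u-turn' for the turn at `here`.

    Assumption: points are (x, y) pairs on a flat grid, x east and y north.
    """
    # Direction of travel into and out of the current point.
    in_dx, in_dy = here[0] - prev[0], here[1] - prev[1]
    out_dx, out_dy = nxt[0] - here[0], nxt[1] - here[1]
    # The sign of the 2D cross product says which side the new heading is on.
    cross = in_dx * out_dy - in_dy * out_dx
    # The dot product separates "keep going" from "turn around".
    dot = in_dx * out_dx + in_dy * out_dy
    if abs(cross) <= tolerance:
        return "straight" if dot >= 0 else "u-turn"
    return "left" if cross > 0 else "right"


def directions(path):
    """One instruction for every interior waypoint of the path."""
    return [turn_instruction(a, b, c) for a, b, c in zip(path, path[1:], path[2:])]


# Drive north, turn west, then turn north again.
example_path = [(0, 0), (0, 1), (-1, 1), (-1, 2)]
print(directions(example_path))  # ['left', 'right']
```

The waypoints are the data; the list of turns is the "so what?" that the driver actually needs.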

Knowledge

Information-based products are so last decade. We can do better than that by adding knowledge. Knowledge is connecting relevant dots from disparate sources to bring us awareness and insights that we wouldn't have otherwise.

For instance, the Google Maps app is able to bring together road information, current traffic information, and probably current information about construction and roadblocks, and advise me on the quickest route to take based on my current conditions. This is impressive, and it may be cast as analytically sophisticated, but in my opinion it's not. The sophistication lies in the real-time convergence of relevant information (knowledge), but I'm guessing the actual analysis is not that sophisticated.
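To show how simple the analysis can be once the sources are brought together, here is a minimal sketch. It is not a description of how Google Maps works. The `Segment` type, the function names, the speed factors, and the route data are all made up for illustration: static road data, a live traffic feed (a speed factor per segment), and a closure feed are merged into one travel-time estimate per route.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    length_km: float
    free_flow_kmh: float  # speed with no traffic


def travel_minutes(route, speed_factor, closed):
    """Minutes to drive a route right now, or None if any segment is closed.

    speed_factor maps a segment name to the fraction of free-flow speed
    currently achievable (1.0 = no congestion). Missing names default to 1.0.
    """
    total = 0.0
    for seg in route:
        if seg.name in closed:
            return None
        current_kmh = seg.free_flow_kmh * speed_factor.get(seg.name, 1.0)
        total += seg.length_km / current_kmh * 60
    return total


def quickest_route(routes, speed_factor, closed):
    """Return (route name, minutes) for the fastest open route, or (None, None)."""
    best = (None, None)
    for name, route in routes.items():
        minutes = travel_minutes(route, speed_factor, closed)
        if minutes is not None and (best[1] is None or minutes < best[1]):
            best = (name, minutes)
    return best


# Hypothetical road data for two ways to reach the airport.
ROUTES = {
    "freeway": [Segment("on-ramp street", 2, 40), Segment("freeway", 20, 100)],
    "side streets": [Segment("side street A", 10, 40), Segment("side street B", 14, 40)],
}

# Heavy freeway traffic (20% of free-flow speed) makes the side streets quicker.
print(quickest_route(ROUTES, {"freeway": 0.2}, closed=set()))
```

The arithmetic is trivial; the value comes from having the road, traffic, and closure sources available together at the moment the decision is made.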

Wisdom

This is where wisdom comes in. Wisdom is where your application gets really interesting, and extremely valuable if it's done right. Wisdom is taking everything we've done so far, and then applying advanced analytical models to take the product to the next level.

Where knowledge can tell you the right course of action based on current conditions, wisdom can tell you the right course of action based on what's predicted to happen in the near future. There may be heavy traffic on the freeway now, but it may dissipate soon. So, in spite of the current gridlock, it may be best to stay on the freeway for a short while until things clear up. The only way to know this is with artificial intelligence that has learned the traffic patterns in that area and can predict how they will evolve.
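The difference between the two levels can be sketched in a few lines. This is an illustration under loud assumptions, not a real traffic model: the forecast functions below are hard-coded stand-ins for whatever predictive model you would actually train, and all names, speeds, and distances are hypothetical. The "knowledge" estimate assumes the current speed holds for the whole trip, while the "wisdom" estimate steps through the predicted speeds minute by minute.

```python
def minutes_now(length_km, speed_at):
    """Knowledge: assume the current speed (t = 0) holds for the whole trip."""
    return length_km / speed_at(0) * 60


def minutes_forecast(length_km, speed_at, step_min=1.0, horizon_min=600.0):
    """Wisdom: advance in small time steps using the predicted speed at each step.

    speed_at(t) returns the predicted speed in km/h, t minutes from now.
    """
    covered, t = 0.0, 0.0
    while covered < length_km:
        if t >= horizon_min:
            raise ValueError("trip not completed within the forecast horizon")
        covered += speed_at(t) * step_min / 60
        t += step_min
    return t


def freeway_speed(t_min):
    # Hypothetical forecast: gridlock now, clearing 10 minutes from now.
    return 30.0 if t_min < 10 else 120.0


def side_street_speed(t_min):
    # Hypothetical forecast: slow but steady.
    return 30.0


# Route name -> (length in km, speed forecast)
ROUTE_FORECASTS = {
    "freeway": (21.0, freeway_speed),
    "side streets": (15.0, side_street_speed),
}


def choose(routes, estimate):
    """Pick the route name with the smallest estimated travel time."""
    return min(routes, key=lambda name: estimate(*routes[name]))


print(choose(ROUTE_FORECASTS, minutes_now))       # side streets
print(choose(ROUTE_FORECASTS, minutes_forecast))  # freeway
```

With these made-up numbers, the current-conditions estimate sends you to the side streets (30 minutes versus 42), while the forecast-based estimate keeps you on the freeway (18 minutes versus 30). The hard part in practice is producing a forecast you can trust, which is where the advanced analytical models come in.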

I don't know whether Google Maps incorporates this level of sophistication in its analysis, but I doubt it. If you can pull this off, you'll have customers for life.

Summary

We're becoming a real-time society. Information was valuable at one time, but now it's a commodity. Not only do people expect immediate access to whatever information they need in the moment, but they're valuing applications that can bring them new insights and advise them on what to do with these insights.

To be competitive with data science, it's important to embrace this reality. My advice is to use the DIKW pyramid as a framework for progressively evolving your data science products and services, and don't stop until you reach the top. When benchmarking against your competition, you don't want to be looking up.

About John Weathington

John Weathington is President and CEO of Excellent Management Systems, Inc., a management consultancy that helps executives turn chaotic information into profitable wisdom.
