What is the "time value" of data? "Big data doesn't quite expire but it becomes horribly stale," said SumAll CEO Dane Atkinson in a recent email Q&A with TechRepublic. "Data is at its most potent form in the moments after it is collected."
SumAll was founded in 2011 with the conviction that "everyone, not just huge enterprises, should have unrestricted access to their data to improve their business and lives." The company provides a data analytics tool that enables customers to visualize social media, e-commerce, advertising, email, and traffic data to produce a holistic view of a firm's activities.
Atkinson used the weather as an example of the "time value" of data. The first tier is an immediate response to an imminent threat, such as a tsunami. The next tier is several days of data, such as acting on the news of a snowstorm or an approaching hurricane. Data spanning weeks and months reveals patterns that affect crops and travel plans. And the "long tail" data helps you see large patterns, such as climate change.
He explained this analogy in terms of internet advertising, and also discussed the "mismatch" of expectations regarding analytical insights, and SumAll's mission and solutions.
TechRepublic: In our initial contact, you raised an interesting question: Does big data expire? How fast does data have to be used in order for business to benefit from it?
Dane Atkinson: Big data doesn't quite expire, but it becomes horribly stale. Data is at its most potent form in the moments after it is collected, nice and fresh. The biggest potential for impact comes in the reaction to that moment. But to leverage that opportunity you need to have a data design geared for a result.
A classic big data version here is to think of weather. The immediate reaction, for instance, is to watch for distinct data patterns that indicate a tsunami, then immediately warn populations at risk. Clearly a big impact from having the data and saving lives. The next tier is data from the past several days, which in our "tragedy" analogy here would be hurricanes, snowstorms, and the like. While not quite as immediate, they also create long-term impact. Next are patterns in the span of weeks and months, which affect crops, travel, and a plethora of important but less immediately actionable items. Lastly the long tail helps us understand large patterns, global warming and the like. For the longest data set, the detail resolution is less critical.
Taking that example to the internet, the immediate reactions can be seen in ad tech. Watching web users surf and immediately serving the exact right ad has the highest conversion impact. Taking that data and triggering a targeted outreach minutes and hours later is the next tier. Data over the next several weeks are mostly used for optimizing your business funnels and products to stay aligned with the market. The really long-term data, like weather, helps see bigger trends but in itself does not create any of the impactful automations.
To give a quick sense of the value in data, here is an illustration (Figure A) modeling our experience with the big stuff!
TechRepublic: In addition to simple and fast access to data, what are the main challenges that companies face with big data?
Dane Atkinson: The expectation mismatch is a huge problem! Everyone is building monster big data reservoirs, and somehow they thought that the reservoirs would act like a smart oracle that you could cast a coin into and get out a magical answer. Nuts! That thought has created giant cost centers that require so much overhead there is rarely a moment for the analyst to think. You end up with a "hoarder's" closet where asking any question takes days to turn around. It is considerably better to build pipelines or rivers for your data to flow, with "water wheels" placed all along the side to leverage it.
Away from the "closet" metaphor — companies license cutting edge databases, put them on SSD, and find they can barely run queries. The dream here is that with all that information, you can find a subtle pattern revealing exactly how to triple your revenue. In reality, folks get worn out asking their analyst questions only to get a result days later, and analysts are tied down in cleaning the giant pool. A far better path is to put some event processing on a single river, and have a data feed looking for the right pattern.
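As a rough illustration of the "event processing on a single river" idea Atkinson describes, here is a minimal Python sketch. It is not SumAll's implementation. The event shape, the `watch_stream` function, and the "cart_add" pattern are all hypothetical, chosen only to show a feed that watches for one pattern as data flows past instead of querying a stored pool afterward.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp in seconds, event name).
Event = Tuple[float, str]


def watch_stream(
    events: Iterable[Event],
    name: str,
    threshold: int,
    window_seconds: float,
) -> Iterator[float]:
    """Yield a timestamp whenever `name` has occurred at least
    `threshold` times within the trailing `window_seconds`."""
    recent: Deque[float] = deque()
    for timestamp, event_name in events:
        if event_name != name:
            continue  # ignore events that are not part of the pattern
        recent.append(timestamp)
        # Drop occurrences that have fallen out of the time window.
        while recent and timestamp - recent[0] > window_seconds:
            recent.popleft()
        if len(recent) >= threshold:
            yield timestamp
            recent.clear()  # reset so one burst produces one alert


# Made-up sample data for illustration only.
sample_events = [
    (0, "view"), (1, "cart_add"), (2, "cart_add"), (3, "view"),
    (4, "cart_add"), (100, "cart_add"), (101, "cart_add"),
]

for alert_time in watch_stream(sample_events, "cart_add", threshold=3, window_seconds=10):
    print(f"pattern seen at t={alert_time}")
```

Because the function is a generator, it reacts to each event as it arrives and keeps only the few timestamps inside the window, which is the point of the "river" approach: the pattern is caught while the data is still fresh.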
TechRepublic: What was the founding mission of SumAll in 2011?
Dane Atkinson: When we started SumAll we first focused on culture, as great things are built by great cultures, not strategy. We then took on a challenge to be proud of. We saw data as the next big driver of human improvement, and believed that inside the volumes of information created by us all, the opportunity for betterment was extreme! Looking into the landscape, a new divide was coming: one where big enterprises were leveraging their customer data to better achieve their own ends, and serving the customers only when it helped their goals.
Considering how powerful data is, a future where businesses and people are nearly invisibly influenced to act against their own interest seemed certain. We wanted to play a role in blocking that. Giving the data back to those who created it seemed the first step. So we built a tool that without engineers or analysts would at least give you back your data cleanly and freely. It's been a rocket ship ever since!
TechRepublic: What are the main features and benefits of SumAll's solutions?
Dane Atkinson: We have far to go still, but in our three years we have at least made the data return to its owner. More than that we allow it to sit next to other data that never collides. In practice that means a business can add in their revenue data, social data, ad data, traffic data, support data, and much more with just the user name and passwords to the various systems. The owner can then finally see what ties together, whether ads get sales, whether social works, where he is spending his energy.
We will forever keep that free and grow the footprint of data we cover! We will grow into bringing those instant impacts and automations.
TechRepublic: At present SumAll's Basic Service is still free. Will this change?
Dane Atkinson: Never! We believe that the creators of "exhaust data" [according to The Economist, data exhaust is "the trail of clicks that internet users leave behind from which value can be extracted"] should own it, have access to it, for free. We will charge for nifty things that can be done with it later but never to store or access. We are closing in on a million businesses on the platform with millions of data streams pulling every few minutes! That's cool!
TechRepublic: Regarding big data, do you think there is an important issue that's not getting enough attention?
Dane Atkinson: Big data is a game changer nearly on a level with the internet itself. Just like the internet had the potential to divide humanity, so does big data. We have mostly won that battle but as data is far less obvious, we worry we may not. Presently, giant enterprises use big data to maneuver the population into behavior patterns they desire, buying things. It has become accepted that small businesses and individuals create data in their every activity but that data isn't owned or even accessible by the creators. That data might be marginalized by calling it "exhaust data," but it should work for its creators and not for the companies trying to manipulate them.
Dane Atkinson: It is CRITICAL. I am astonished at how few people cultivate mentors and am sure the few that do find better outcomes. It is also far from a one-way street; it really never has been. The mentor draws energy, hopefulness, ambition, and risk tolerance from mentoring and hopefully returns some guidance on the pitfalls that may be avoided. My mentors were indispensable to me, and now those I mentor in Techstars and elsewhere provide a key to continued success.
Brian will do client work for AtTask.
Brian Taylor is a contributing writer for TechRepublic. He covers the tech trends, solutions, risks, and research that IT leaders need to know about, from startups to the enterprise. Technology is creating a new world, and he loves to report on it.