Traditional BI is far too slow to be of use to machines. Matt Asay explains.
NoSQL has been a "complete game changer," according to database guru Michael Franklin. And yet, it has hardly touched the traditional Business Intelligence (BI) industry.
Oh, sure, you can get an ODBC driver to connect Tableau to Cassandra or a similar connection between MongoDB and Informatica, but these are low-fidelity connectors for a problem that demands high-fidelity data. They're not good enough.
According to DataStax CEO Billy Bosworth, however, it's also beside the point.
In an interview, Bosworth told me that the industry has split between machine BI and human BI. That is, lightning-fast, instantly actionable BI for machines and more ad-hoc querying for insights by humans.
Enterprises ultimately want both, but trying to squeeze big data into legacy tools is the wrong approach.
Thinking different about data
To date, traditional BI for NoSQL databases like Cassandra, the wide-column database backed by DataStax, has been somewhat hit-or-miss, and mostly "miss." For the most part, those hoping to connect traditional BI tools with Cassandra or other NoSQL databases have been forced into using low-fidelity ODBC drivers, an "inefficient layer that attempts to provide SQL access," as Zoomdata CEO Justin Langseth once told me.
Such drivers can't cope with the flexible schema of a NoSQL database, so they essentially strip the richness from the data to help legacy BI tools digest it.
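To make the flattening problem concrete, here is a minimal, purely hypothetical sketch (the records and field names are invented, not from any real driver): when a fixed-column SQL layer sits over a flexible-schema store, it can only expose the fields every record shares, so nested and record-specific data gets dropped.

```python
# Hypothetical records from a flexible-schema store. Each record carries
# different nested fields -- exactly what a fixed-column view cannot hold.
records = [
    {"id": 1, "name": "alice", "devices": [{"type": "phone", "os": "ios"}]},
    {"id": 2, "name": "bob", "plan": {"tier": "gold", "since": "2014"}},
]

# Keep only the columns every record shares: the "lowest common schema".
shared = set(records[0])
for r in records[1:]:
    shared &= set(r)

# The "devices" and "plan" structures -- the richness -- are stripped away.
flattened = [{k: r[k] for k in sorted(shared)} for r in records]
print(flattened)  # [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]
```

This is the sense in which such a layer trades fidelity for SQL compatibility: the tabular result is digestible by a legacy BI tool, but the structure that made the data useful is gone.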
Not surprisingly, this isn't a winning recipe for insight.
Nor is it relevant for modern applications. As Bosworth puts it, "The new world of transactional applications requires contextual awareness. That means that the app often must morph itself from transaction to transaction to provide a better experience."
And while this certainly requires "intelligence," he says it is more "transactional intelligence" than traditional "business intelligence."
Or, said differently, it is "Machine BI" vs. "Human BI."
"'Machine BI' is intelligence that has to take place at the processing speed of a machine in order to make a transactional app smarter from transaction to transaction. Human intervention is not possible in this model, and therefore, not a design objective. 'Machine BI' is an integrated part of an application, designed into it by engineers based on business requirements. The 'learning' that occurs is put into a 'fast feedback loop' at machine speeds to make each transaction more informed or contextual when appropriate."
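The "fast feedback loop" Bosworth describes can be sketched in a few lines. This is an illustrative toy, not DataStax code: the state, threshold, and function names are all invented. The point is that each transaction updates state in-line, and the very next transaction reads it, with no human in the loop.

```python
# Hypothetical feedback loop: each transaction folds its observation into
# running state at machine speed, and the next transaction consults it.
state = {"avg_latency_ms": 0.0, "n": 0}

def record(latency_ms):
    # Incremental running mean, updated at transaction time.
    state["n"] += 1
    state["avg_latency_ms"] += (latency_ms - state["avg_latency_ms"]) / state["n"]

def choose_path():
    # The app "morphs" per transaction: fall back to a cheaper code path
    # once the observed average crosses an (invented) 100 ms budget.
    return "fallback" if state["avg_latency_ms"] > 100 else "full"

for ms in (40, 60, 300):
    record(ms)
print(choose_path())  # 'fallback' -- the running mean has crossed the budget
```

No dashboard or analyst ever sees this loop; it is, as Bosworth says, designed into the application by engineers and runs at a speed where human intervention is not a design objective.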
The primary problem with "low fidelity" ODBC connectors to Cassandra or other NoSQL databases, then, isn't really about fidelity at all. It's a matter of speed.
Indeed, Bosworth takes issue with my characterization of ODBC connectors as "low fidelity," insisting that there is no loss of Cassandra data between Cassandra and, say, Tableau. "It is just that, as with any human BI solution, if the transactional data is extraordinarily voluminous, exports and scans can take a long time, even for a human, and that can create problems that need fixing with data modeling techniques."
In other words, according to Bosworth, the issue with NoSQL and modern data technologies "isn't so much a matter of ODBC versus native connectors as it is a data modeling paradigm shift that must also occur in this new world of global transactional applications."
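That "data modeling paradigm shift" is usually described as query-first design: instead of normalizing data and scanning or joining at read time, you denormalize into a table per query, keyed by what you will read with. A minimal sketch, using plain Python dicts as stand-ins (the data and names are invented, and the dict merely models a table partitioned by user):

```python
# RDBMS habit: normalized rows, answered by a read-time scan (or join).
users = {101: {"name": "alice"}}
orders = [
    {"user_id": 101, "item": "widget"},
    {"user_id": 101, "item": "gadget"},
]

def orders_by_user_scan(uid):
    # Scans every order at read time -- tolerable for ad-hoc human BI,
    # too slow for per-transaction machine decisions at scale.
    return [o["item"] for o in orders if o["user_id"] == uid]

# Query-first habit: build a structure shaped like the question, keyed by
# user, so each read is a single lookup -- the dict stands in for a table
# partitioned by user_id.
orders_by_user = {}
for o in orders:
    orders_by_user.setdefault(o["user_id"], []).append(o["item"])

def orders_by_user_lookup(uid):
    return orders_by_user.get(uid, [])

print(orders_by_user_lookup(101))  # ['widget', 'gadget']
```

Both paths return the same answer; the shift is in *when* the work happens. The query-first model pays at write time so reads stay at machine speed, which is the trade-off a transactional application needs.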
The rise of the machines
So, where does that leave us?
"Over time," Bosworth states, "BI vendors need native Cassandra integration to fully deliver" and "not just 'pretend everything is an RDBMS' with ODBC," but ultimately (as stated above), this is a matter of thinking about data differently.
To wit, even as data connectors get better for people to retrieve and study data, they will simply never get fast enough to keep pace with the demands of today's online, all-the-time applications. "Human fast is irrelevant compared to machine fast," Bosworth stresses. "So, bear in mind that no matter how much they improve, they are still servicing a human being vs. morphing an application in machine time."
With machines doing BI at human-impossible speeds, does this leave us to simply maintain uptime of the machines? Not according to Bosworth.
Human BI, he posits, certainly still has its important place, but it is generally around two dominant use cases:
- Longitudinal analysis of data pulled from multiple systems or applications and analyzed from many different perspectives, including "what if" analysis. For example, a general manager of a utility company interacting with data that shows regional consumption of energy along with weather information and global crude oil prices, all over a three-year period.
- Data pulled from the context of its generating application to be studied by humans (who, perhaps, do not even use the primary system directly) to derive insight or track progress. For example, a product manager accessing Salesforce data from Tableau to see selling patterns by geographical region on a daily basis.
While such human BI runs at a more languid pace, "These are valuable and necessary for business," Bosworth insists, though "They do not affect how a transactional application can be more contextual in machine time."
All of this plays out against a BI backdrop that needs to be fundamentally rethought.
By talking about "machine BI," Bosworth isn't trying to "conjure visions of exciting things like Artificial Intelligence that can outsmart humans or some such thing." Instead, he's trying to get us to think of BI in a completely different way.
Just as his daughter has no clue what he means by "roll up the window" ("Dad, why do you say roll up the window? What rolls?"), he tells me, "Sometimes our legacy language gets confusing in the new world."
There are two classes of BI in our modern world: "Machine BI" and "Human BI." They are fundamentally different and operate at dramatically different speeds. Neither is better than the other, nor can one replace the other.
Not that the machines won't try.