SAP has been boasting about its "revolutionary" big data platform, SAP HANA, for years. While its claims have always been a bit suspect, recent revelations that HANA is riddled with critical security flaws only reinforce the mantra that, when it comes to big data infrastructure, open source is best.
Most other companies get this, even hitherto proprietary giants like IBM. Will SAP get the memo in time to rejigger its approach to big data?
Probably not, which is why SAP customers should check out what open source has to offer.
(Almost) everything has open source inside
Even the most proprietary software generally has open source inside. That's why Gartner analyst Martin Kihn can declare with utmost assurance that everything—everything—is open source now, to some degree:
I am willing to guess you—yes, you—would be shocked if you really understood to what extent that whizzy piece of expensive cloud software you're using actually (deep, deep in its soul) was running on absolutely free, not-developed-here, open source technology that you—yes, you—could probably bang into something almost as useful if you only knew how to do it.
It's also why you really, really shouldn't be futzing with SAP HANA, or any other proprietary data infrastructure that tries to go it alone without the aid of the open source community. Cloudera chief strategy officer Mike Olson perhaps said it best:
There's been a stunning and irreversible trend in enterprise infrastructure. No dominant platform-level software infrastructure has emerged in the last 10 years in closed-source, proprietary form.
Which brings us to...HANA.
Big data, big problems
HANA, SAP's big data darling, has been the subject of controversy for some time. For years Wall Street analysts like Peter Goldmacher (formerly of Cowen & Co.) have criticized SAP's financial treatment of HANA, arguing that the legacy software vendor had been misrepresenting HANA revenue to project an "inflated growth rate."
In short, he and others argued HANA's zero-to-$1 billion rapid growth story was "highly, highly unlikely."
But wait! It gets worse.
That's because security firm Onapsis just uncovered 21 significant security flaws in HANA, eight of which it deemed "critical."
How critical? Unless companies act to change system configurations, "unauthenticated attackers could take full control of vulnerable SAP HANA systems, including stealing, deleting or changing business information, as well as taking the platform offline to disrupt key business processes."
Not a cheery thought.
And it's why Host Analytics CEO (and data infrastructure expert) Dave Kellogg advises HANA customers to switch to "standard infrastructure," in part because it's "more proven."
Peaceful coexistence...for now
Standard infrastructure like, for example, Apache Spark.
Of course, Spark sponsor Databricks will be quick to say that SAP HANA and Spark are complementary, that the former is great for analyzing legacy enterprise data stuck in a CRM or ERP system while the latter handles...pretty much everything else.
This is true, but maybe not relevant. At least, not for long.
After all, as Kamlesh Barvalia, business intelligence and analytics leader at GE, argued, there is "a great deal of overlap" between the two in terms of features and use cases, and many (like him) will "bet on Spark for the long haul."
Why? Because Spark is open source (so "you do not run the risk of getting yourself trapped in proprietary development platforms" like HANA), because it is cheaper, and because "there is a great deal of momentum behind Spark and it appears that the feature overlap as well as breadth and depth of offering will only increase as the time goes by."
Stated in pithier fashion, "What Dave Kellogg said."
Spark isn't the only open source challenge to HANA's alleged momentum (HANA ranks barely ahead of Informix in terms of overall popularity). The open source community keeps leapfrogging itself with better and better data infrastructure, hatched and released by companies like Google, Facebook, and LinkedIn, which manage scale and speed that even SAP can hardly fathom. Given that pace, this is the open source community's market to lose.
But it won't, for all the reasons Mike Olson called out in his post. Ultimately, all data infrastructure will be open, or it will be irrelevant.
Matt is currently head of the developer ecosystem at Adobe. The views expressed are his own, not those of his employer.
Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.