Oracle Big Data Appliance: Data so big it's scary?

With the new Oracle Big Data Appliance, Oracle will commercialize NoSQL and Hadoop for cloud-powered big data analytics.

Oracle, which has been slow to transform itself for the coming cloud era, took an important step on Monday at Oracle OpenWorld 2011 by announcing the Oracle Big Data Appliance, which is built on NoSQL and Hadoop -- key technologies for the future of the cloud and "big data."

Oracle's flagship database is built on a solid old technology called the relational database, but newer web sites and web apps have moved beyond the relational database in order to scale to a global level at much faster speeds. Sites like Twitter, Facebook, and Netflix are examples of the kinds of sites that have had to use NoSQL in order to meet the demands of the rapidly-expanding global web, especially for sites/apps that are more interactive and not just about loading pages.

For Oracle's purposes, it wants to have a NoSQL and Hadoop product to allow businesses to grab unstructured data from across the web and then use it to build powerful new reports. Oracle will also provide a bridge to its other products so that this unstructured data can be combined with structured data in a traditional Oracle database to provide businesses with the ability to do reports and real-time business analytics that are based on streams of both structured data and unstructured data.

So what does that look like in the real world? Interestingly enough, an earlier keynote from Oracle partner EMC, which is working on a similar product (albeit from a software perspective), provided a perfect example. In this case, it was an auto insurance company that is using big data to set better rates for its customers. Analyzing big data has shown that the vast majority of customers are safe drivers and that they are subsidizing the cost of a small sample of really bad drivers. So, this insurance company could use this big data to do two things to change the rates of its customers and save the majority of its customers more money (and, conversely, make bad drivers pay more of their own way).

The big data software for this insurance company could set the standard rate for the customer and then provide a discount (or penalty) based on more thorough data analysis. The first analysis would be based on structured data (driving record, legal record, credit score, etc.). The second analysis could be based on an unstructured source of data such as the person's social graph (Twitter stream, YouTube views, etc.). People who post a lot of parenting-related activity on their social graph would likely get a discount, while those whose social graph is full of thrill-seeking activity would likely get a penalty.
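To make the two-step idea concrete, here is a minimal Python sketch of how a standard rate might be adjusted first by structured data and then by an unstructured social feed. Everything in it is a hypothetical illustration rather than anything Oracle or EMC described: the names (`StructuredProfile`, `structured_adjustment`, `social_adjustment`, `quote`), the keyword lists, and the percentages are all assumptions chosen for readability.

```python
from dataclasses import dataclass

# Hypothetical keyword lists; a real system would need far richer text analysis.
FAMILY_TERMS = {"kids", "school", "family"}
THRILL_TERMS = {"skydiving", "racing", "stunt"}


@dataclass
class StructuredProfile:
    """Structured data of the kind already held in a relational database."""
    violations: int    # moving violations on the driving record (assumed field)
    credit_score: int  # assumed field


def structured_adjustment(profile: StructuredProfile) -> float:
    """First analysis: adjustment derived from structured records.

    The 5% figures are made-up illustrative values.
    """
    adjustment = 0.05 * profile.violations  # penalty per violation
    if profile.credit_score >= 700:
        adjustment -= 0.05                  # discount for a good credit score
    return adjustment


def social_adjustment(posts: list[str]) -> float:
    """Second analysis: adjustment derived from unstructured social posts.

    Counts keyword hits and returns a flat 10% discount or penalty.
    """
    words = [w.strip(".,!?").lower() for post in posts for w in post.split()]
    family_hits = sum(w in FAMILY_TERMS for w in words)
    thrill_hits = sum(w in THRILL_TERMS for w in words)
    if thrill_hits > family_hits:
        return 0.10
    if family_hits > thrill_hits:
        return -0.10
    return 0.0


def quote(base_rate: float, profile: StructuredProfile, posts: list[str]) -> float:
    """Apply both adjustments to the standard rate."""
    factor = 1.0 + structured_adjustment(profile) + social_adjustment(posts)
    return round(base_rate * factor, 2)


if __name__ == "__main__":
    safe = StructuredProfile(violations=0, credit_score=720)
    print(quote(1000.0, safe, ["Took the kids to school this morning"]))
```

The point of the sketch is only the shape of the calculation: one adjustment comes from clean, structured records, the other from noisy text that has to be interpreted before it can be scored.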

Your social graph having a financial impact on you personally may sound a little scary -- and let's be clear that this example is only conceptual at this point -- but everyone should be aware that this is the kind of thing that companies are going to be able to do in the future. This shows how businesses will soon be able to mine public data with products like the Oracle Big Data Appliance. You can already do much of this now by hacking something together with NoSQL and Hadoop, but Oracle is ready to commercialize it in a big way.

Here's how Andrew Mendelsohn, senior vice president of Oracle Server Technologies, put the announcement in context for Oracle:

"With the explosion of data in the past decade, including more machine-generated data and social data, companies are faced with the challenge of acquiring, organizing and analyzing this data to make better business decisions. New technologies, such as Hadoop, offer some relief, but don't provide a holistic solution for customers' Big Data needs. With today's announcement, Oracle becomes the first vendor to offer customers a complete and integrated set of products to address critical Big Data requirements, unlock efficiencies, simplify management and create data insights that maximize business value."

Jason Hiner is Editor in Chief of TechRepublic and Long Form Editor of ZDNet. He writes about the people, products, and ideas changing how we live and work in the 21st century. He's co-author of the book, Follow the Geeks.


So, one needs to bare their soul and hope for the best rates? Am I missing something?


Why the word? When you want to find a certain solution for a certain problem, this term comes to mind and is overused by major companies. Yes, Oracle's latest servers look impressive, but when it comes to a solution they have to be aware that even with their "cheap" solution there may be other, even lower-cost solutions. Hadoop is a tool that, when used and applied with all the details needed by a customer, is very powerful, and many other cloud solutions like it exist for this kind of heavy and big data. In my humble opinion there is no holistic solution when it comes to finding and implementing big-data-oriented tools. It is too soon to set a standard, and most of all this is no ISO environment; you have to be alert and creative with this not-so-new genre of IT.


I was interested in the reference to a "social graph" and the mining of social data. I do agree that it's the way forward. I would be interested to read a more detailed article on how you see that happening, what practicalities are facing organisations (e.g. data cleansing as pointed out by Brian J. Bartlett) and what it means on a personal level. You should check out the startup Lenddo, who are doing social media based reputation management. P.S. I have no involvement with this business, but it does seem to relate to your article.

Brian J. Bartlett

Unless the Big Data Appliance comes pre-integrated with their Exadata platform, I don't see this helping "unlock efficiencies, simplify management and create data insights that maximize business value." At least with the example given. Instead we have another isolated data warehouse. Furthermore, having looked at the various Oracle "solutions," you can do better with the right people. Even if you do buy Oracle, you'll still need the people. At best, Oracle will save you some time at a rather high cost, and I still have reservations about the time element. You simply cannot buy expertise in the form of hardware and software when it comes to analytics (a field I started working in professionally at the tender age of 14). BTW, your insurance example will suffer from all the usual data-cleansing issues that are faced in the unstructured world. The identified social graph from the data collection, ETL, and integration process may, or may not, be accurate. Given "the state of the art," I'd lay a bet that someone is going to litigate in the not too distant future around this. One need only look at the TSA's "Do Not Fly" list to see an egregious example. At least that investment in more big iron will make the discovery process easier and faster!
