Ian Hardenburgh introduces the Google Prediction API, a cloud-based set of machine-learning tools that can help you analyze unstructured data.
In response to the explosive growth of social networking and user review applications, and the desire to understand the opinions shared across them, big data has become an important issue for technology. The rise of big data has given data scientists vast data sets to which they can apply advanced mining techniques based on complex algorithms, such as those used in sentiment analysis and predictive modeling. At the same time, these techniques are extremely difficult to learn, develop, and disseminate across an enterprise.
Given how social networking and messaging pervade everyday life, almost every organization can benefit from machine learning or advanced business data analytics. However, many enterprises, even relatively large ones, simply don't have the required talent, or don't want to break the bank on an exorbitant solution (e.g., IBM's Smart Computing infrastructure) that can potentially add more cost than value. This is where the Google Prediction API hopes to come in.
A cloud-based pattern matching and machine learning tool, the Google Prediction API gives ordinary users the means to perform sophisticated data analysis (customer sentiment analysis, churn analysis, upsell opportunity analysis, etc.) and to build recommendation and intelligent routing systems (e.g., e-mail classification). It is powerful, yet simple enough to use in most Internet-facing applications.
The Google Prediction API relies on classifiers that do most of the heavy lifting for you when it comes to programming the service to make "predictions". No working knowledge of complex artificial intelligence algorithms is therefore necessary, although some background in programming will be helpful. Furthermore, no software installation is required, as the Prediction API is accessed through a RESTful interface.
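For readers who have not met the term before, here is a toy illustration of what a classifier does: it learns from labeled examples and then assigns a label to new input. This is a deliberately naive word-count sketch written for this article, not the algorithm Google uses; the function names `train` and `predict` and the sample sentences are all hypothetical.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word appears under each label.

    `examples` is an iterable of (label, text) pairs.
    """
    counts = defaultdict(Counter)
    for label, text in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Return the label whose training words overlap most with `text`."""
    words = text.lower().split()
    scores = {
        label: sum(word_counts[w] for w in words)
        for label, word_counts in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical labeled training data for a sentiment classifier.
model = train([
    ("positive", "great product love it"),
    ("positive", "love the fast service"),
    ("negative", "terrible product hate it"),
    ("negative", "slow service hate waiting"),
])

print(predict(model, "I love this product"))
```

A hosted service hides this step entirely: you supply the labeled examples and ask for predictions, and the choice of algorithm is made for you.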
To get a true sense of how to use the Google Prediction API, it's best to start by following Google's Hello Prediction application tutorial. But first, you must sign up for the service with a Google account (it doesn't matter whether free or paid) by going to the API Console. Then follow the tutorial's easy-to-understand instructions.
To show how easy it is to build the Hello Prediction application, here is the whole process. You create a new API project, activate Google Cloud Storage and the Prediction API by toggling a couple of buttons, and enable billing using your Google Checkout account. Then you upload sample data provided by Google to a Google Cloud Storage bucket, turn on the Prediction API via Google APIs Explorer, and start training your model by selecting the appropriate training method. You have to assign an ID to your model and point to the Google Cloud Storage bucket that contains the sample data. Lastly, to make a prediction, you declare the model ID and invoke the model by sending a parameterized query containing the input that is to be evaluated and classified.
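The two requests described above (train a model, then query it) can be sketched as plain JSON payloads. This is a minimal sketch that only builds and prints the request bodies; it does not contact any service. The field names `id`, `storageDataLocation`, and `input.csvInstance` reflect my recollection of the API's request format and should be treated as assumptions to verify against Google's reference, and the model ID, bucket name, and file name are hypothetical.

```python
import json


def build_training_request(model_id, bucket, object_name):
    """Body for the training call: a model ID plus the Cloud Storage location.

    Field names are assumptions; check them against the API reference.
    """
    return {
        "id": model_id,
        "storageDataLocation": f"{bucket}/{object_name}",
    }


def build_prediction_request(features):
    """Body for the prediction call: the input values to classify."""
    return {"input": {"csvInstance": list(features)}}


# Hypothetical names, for illustration only.
train_body = build_training_request("hello-model", "my-bucket", "sample_data.txt")
predict_body = build_prediction_request(["Bonjour tout le monde"])

print(json.dumps(train_body, indent=2))
print(json.dumps(predict_body, indent=2))
```

In the tutorial itself you would paste bodies like these into the Google APIs Explorer rather than send them from code, which is why the sketch stops short of making an HTTP call.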
It's worth noting that running the Hello Prediction application will not cost you anything, as there is a free quota (see the pricing details). Google Cloud Storage also includes a very generous free quota, which will be nearly impossible to surpass with the sample data alone, even if you run the Hello Prediction application continuously throughout the day. Be wary of calling the Prediction API too often, though, as it is capped at 100 predictions per day, with a lifetime cap of 20,000 predictions. As a precaution, you can always disable billing.
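If you script your experiments, a small client-side counter can keep you inside those caps. The sketch below assumes the two limits quoted above (100 per day, 20,000 lifetime); the `QuotaGuard` class is hypothetical and is not part of any Google library, and it only tracks calls made through it in the current process.

```python
from dataclasses import dataclass, field
from datetime import date

DAILY_CAP = 100        # predictions per day, as quoted in the article
LIFETIME_CAP = 20_000  # total predictions, as quoted in the article


@dataclass
class QuotaGuard:
    """Hypothetical helper that refuses calls once either cap is reached."""
    used_today: int = 0
    used_total: int = 0
    day: date = field(default_factory=date.today)

    def allow(self, today=None):
        """Return True and record one call if both caps still have room."""
        today = today or date.today()
        if today != self.day:
            # A new day has started, so reset the daily counter.
            self.day = today
            self.used_today = 0
        if self.used_today >= DAILY_CAP or self.used_total >= LIFETIME_CAP:
            return False
        self.used_today += 1
        self.used_total += 1
        return True


# At the full daily rate, the lifetime cap lasts this many days.
days_until_lifetime_cap = LIFETIME_CAP // DAILY_CAP
print(days_until_lifetime_cap)
```

Call `guard.allow()` before each prediction request and skip the request when it returns `False`.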
The applications and predictive models you can create with the Google Prediction API are already virtually endless, and Google continues to add new features, code libraries, and underlying predictive capabilities. In the meantime, I've compiled some links to resources below to support you in your predictive efforts: