This week, I had the chance to go to Las Vegas to attend the re:Invent Conference, the first-ever developer-focused conference run by Amazon. The focus of this conference was, obviously, Amazon Web Services (AWS): the existing services, customer success stories, best practices, and so on. The sheer size and scope of the event gives a good indication of the size and complexity of AWS, and shows just how far the platform has evolved since its launch in 2006.
In addition to new products and other announcements, which we'll go into in more detail shortly, some interesting and impressive facts were thrown out to the audience over the course of the keynotes and sessions. First, Amazon is now adding to its data centers, every day, more than the server capacity it needed to run its entire business back in 2003. You know, when it was "only" a US$5 billion business. This is scale on a level that's almost impossible to match, and that anyone can leverage for their business needs.
A second interesting fact is the ramp-up of new product and feature releases over the past few years. AWS has gone from nine releases in 2007, its first full year in operation, to 82 releases in 2011, and expects to end 2012 with over 150 new releases. I believe that this increasing pace of releases and features illustrates the rapid evolution of the cloud market as a whole, and also shows the tremendous complexity of AWS.
A final statistic of note was given in the second keynote by Amazon CTO Werner Vogels. He showed that if Amazon were to follow the rule of thumb of provisioning 15% more capacity than it needs for peak loads, it would be "wasting" about 39% of its server capacity; in a month such as November, this wastage would be even greater, at 76%. This highlights the benefits of cloud scalability, especially when the scaling is automated.
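To make the over-provisioning arithmetic concrete, here is a minimal sketch of how such a wastage figure can be computed. The load samples below are invented for illustration and are not Amazon's data, and the function name is hypothetical; the only thing taken from the keynote is the 15% headroom rule of thumb.

```python
def wasted_fraction(demand, headroom=0.15):
    """Share of a fixed fleet left idle, on average, when capacity is
    sized at the peak load plus a safety margin (15% by default)."""
    if not demand:
        raise ValueError("demand must contain at least one sample")
    capacity = max(demand) * (1 + headroom)   # fixed fleet sized for the peak
    average_load = sum(demand) / len(demand)  # what is actually used on average
    return 1 - average_load / capacity


# Hypothetical load samples (arbitrary units), NOT figures from the keynote.
typical_week = [60, 70, 80, 100, 90, 70, 60]   # fairly even load
spiky_month = [20, 20, 25, 20, 30, 100, 25]    # one large seasonal peak

print(f"typical week: {wasted_fraction(typical_week):.0%} idle")
print(f"spiky month:  {wasted_fraction(spiky_month):.0%} idle")
```

The spikier the load, the larger the idle share, since the whole fleet has to be sized for a peak that is rarely reached. That is the gap that scaling capacity up and down on demand is meant to close.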
One of the major announcements made during the conference was a roughly 25% reduction in Amazon S3 (storage) prices. It's "roughly" 25% because the actual percentage varies a little across regions. Over the course of its existence, AWS has reduced the price of one service or another more than 25 times. While no one is foolish enough to believe that Amazon is doing this out of the kindness of its heart, the fact is that these reductions are not only good for customers, they are only possible because of the growing economies of scale that Amazon can achieve.
These price reductions are good not only for current Amazon customers, who will see immediate benefits from them, but for the cloud market as a whole. As the price of storage (or any other service) drops on Amazon, other players in the market are forced to reduce their prices as well. This is even truer for cloud services that have very low switching costs for customers, and storage definitely falls under this category.
The second major announcement made during the conference concerned new services that will be offered by AWS. The first one is Amazon Redshift, which is essentially a data warehouse in the cloud. Amazon is leveraging the most efficient data warehousing technologies out there - column-oriented data storage, compression, MPP architecture - to deliver very impressive performance. It's also leveraging all of its cloud know-how to do this at a very low price point. The smallest Redshift instance type, which has 2 CPUs, 4 GB of RAM and 2 TB of storage, runs at $0.85 per hour, or just under $7,500 per year.
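The yearly figure follows directly from the hourly rate. As a quick check, the sketch below assumes the instance runs around the clock for a 365-day year at the quoted hourly price, with no discounts; the function name is just for illustration.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: non-leap year, instance never stopped


def annual_cost(hourly_rate, hours_per_day=HOURS_PER_DAY, days=DAYS_PER_YEAR):
    """Yearly cost of an instance billed by the hour and left running."""
    return hourly_rate * hours_per_day * days


# $0.85 per hour is the price quoted above for the smallest instance type.
print(f"${annual_cost(0.85):,.2f} per year")
```

At $0.85 per hour this comes to $7,446 for the year, which matches the "just under $7,500" figure quoted above.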
Compared with the price point of conventional data warehousing solutions, this is 10 or more times cheaper than most solutions out there today. Once again, this leaves the rest of the market to consider reducing prices in order to compete with Amazon, which benefits everyone who purchases a data warehousing solution, not only Amazon customers. You can find more information and sign up for the public beta of Redshift here.
The other service announcement made by Amazon was AWS Data Pipeline, a pipeline (or workflow) service that allows you to run data-related jobs and schedule their future executions. These jobs range from backing up data from one location to another, to running analytics on datasets in order to generate reports and aggregations, and everything in between. In addition to the API for interacting with the service, Amazon has also created a visual pipeline creation tool that allows any user to simply drag and drop data sources and activities to create their own pipeline.
With these two services, Amazon is making a clear move towards enterprises, and trying to show more than ever that the cloud isn't simply for small businesses and internet companies. Its value proposition is very attractive, and as it gains traction amongst large businesses, it will be interesting to see how the traditional data warehousing companies react. It is, as it has been for the past year, a very interesting time for the cloud computing market.
After working for a database company for 8 years, Thoran Rodrigues took the opportunity to open a cloud services company. For two years his company has been providing services for several of the largest e-commerce companies in Brazil, and over this time he has had the opportunity to work on large-scale projects ranging from data retrieval to high-availability critical services.