The concept of the Internet of Things relates to the ever-growing number of objects that, in addition to containing internal sensors and processors, are also directly connected to the web, streaming their data online. While home automation is probably the "top of mind" application for this concept - the refrigerator that orders milk from the grocery store whenever it's running out - the scope is, in fact, much larger. We could have toys that interact with each other independently, offices that automatically order new supplies as needed without our intervention, even sensors on our clothing and bodies streaming our health data to our doctors in real time. This kind of machine-to-machine (M2M) communication is at the crux of the Internet of Things.
For the full potential of the Internet of Things to be realized, however, cloud computing is fundamental. The whole idea behind the connected objects is that the data they collect is mostly streamed online, so that applications can gather, parse, and act on this data efficiently. Going back to our example refrigerator, it's not the refrigerator itself that orders the milk from the grocery store. The refrigerator streams all of its data, from current grocery levels to your historical rate of consumption, to an application, which reads and parses this data. Then, taking into account other factors, such as your current grocery budget and how long it takes for the milk to arrive at your house, it decides whether to make the purchase. The cloud is the natural home for these applications.
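To make the refrigerator example concrete, here is a minimal sketch of the kind of decision such a cloud application might make. Everything in it is an assumption for illustration: the `FridgeReport` fields, the `should_order_milk` function, and the rule itself (order when the milk on hand would run out before a delivery could arrive, provided the budget allows it) are hypothetical and not taken from any real product.

```python
from dataclasses import dataclass


@dataclass
class FridgeReport:
    """Hypothetical data streamed by the refrigerator."""
    milk_liters: float               # current amount of milk in the fridge
    daily_consumption_liters: float  # historical average use per day


def should_order_milk(report: FridgeReport,
                      delivery_days: float,
                      milk_price: float,
                      remaining_budget: float) -> bool:
    """Decide whether to place an order (illustrative rule only)."""
    # Never order if the purchase would exceed the grocery budget.
    if milk_price > remaining_budget:
        return False
    # With no recorded consumption there is nothing to run out of.
    if report.daily_consumption_liters <= 0:
        return False
    # Days until the milk runs out at the historical rate of consumption.
    days_left = report.milk_liters / report.daily_consumption_liters
    # Order if the milk would run out before a delivery could arrive.
    return days_left <= delivery_days


report = FridgeReport(milk_liters=1.0, daily_consumption_liters=0.5)
print(should_order_milk(report, delivery_days=3, milk_price=2.0,
                        remaining_budget=50.0))
```

Note that the refrigerator only supplies the report; the budget and delivery time come from elsewhere, which is why the decision lives in an application rather than in the appliance.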
An ocean of data
If all of our day-to-day objects are to receive every kind of sensor imaginable, the sheer number of data points being generated will be staggering. The Internet of Things, therefore, brings with it all the problems related to storing and parsing this data that we're familiar with, except on a much larger scale. And it's not only a matter of volume, but also of the speed at which this data will be generated. Sensors generate much more data and at a much higher rate than most commercial applications.
To handle both the volume and the speed, cloud-based solutions are fundamental. The cloud provides us the ability to dynamically provision storage resources as our needs grow, and to do so in an automated manner, so that human intervention is no longer necessary. It also gives us access to virtual storage, either through cloud database clusters or through virtualized physical storage that can have its capacity adjusted without downtime, as well as access to a huge pool of storage resources, beyond anything we could have locally.
The second problem with all this data is how to process it, a problem that comes in two flavors. The first is the real-time processing of each data point from each object as it comes in. The second is extracting useful information from the collection of all available data points, and correlating the information from different objects to add real value to the stored data.
While real-time processing seems simple enough - receive the data, parse it, do something with it - it actually isn't. Let's go back to the connected refrigerator, and imagine that it sends a data packet with information about what items were removed or stored every time a person opens the door. If we assume that there are somewhere around 2 billion refrigerators in the world and that people open their refrigerator doors four times per day, that's 8 billion data packets per day. Spread over the 86,400 seconds in a day, that adds up to roughly 93 thousand packets per second, on average, which is a lot. To make matters worse, these packets would probably be concentrated at certain hours of the day (the morning and the evening, mostly), so that if we were to provision processing capacity based on peak loads, a lot of the infrastructure would sit idle the rest of the time.
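The back-of-the-envelope estimate above can be reproduced in a few lines. The 2 billion refrigerators and four door openings per day come from the text; the peak scenario (75% of the daily traffic arriving within 6 peak hours) is purely an assumption chosen to illustrate how far the peak can sit above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_packets_per_second(devices: int, events_per_device_per_day: int) -> float:
    """Average rate if the packets were spread evenly over the day."""
    return devices * events_per_device_per_day / SECONDS_PER_DAY


def peak_packets_per_second(devices: int, events_per_device_per_day: int,
                            peak_share: float, peak_hours: float) -> float:
    """Rate during the peak window, assuming `peak_share` of the daily
    traffic arrives within `peak_hours` hours (hypothetical figures)."""
    peak_events = devices * events_per_device_per_day * peak_share
    return peak_events / (peak_hours * 3600)


avg = average_packets_per_second(2_000_000_000, 4)
peak = peak_packets_per_second(2_000_000_000, 4, peak_share=0.75, peak_hours=6)

print(f"average: {avg:,.0f} packets/s")  # average: 92,593 packets/s
print(f"peak:    {peak:,.0f} packets/s ({peak / avg:.1f}x average)")  # peak:    277,778 packets/s (3.0x average)
```

Under these assumed figures, infrastructure sized for the peak would be three times larger than what the average load requires.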
Once the real-time processing is done, we get to the second problem: how to extract useful information from the stored data beyond the individual transaction level. It's good for you if your refrigerator automatically orders your groceries, but it's even better for the manufacturer if they know that refrigerators from a certain region are more prone to overheating, or that refrigerators with a certain kind of grocery wear out faster. To extract this kind of information from the stored data, we'll need to leverage all the Big Data solutions we have today (and some that are yet to appear).
The cloud is uniquely well suited to handle both problems. In the first case, cloud computing allows for the dynamic allocation (and de-allocation) of processing resources, enabling an application that needs to parse refrigerator data in real time to handle the full data volume at peak hours while keeping its own infrastructure costs down the rest of the day. In the second, the cloud goes hand-in-hand with Big Data solutions, for much the same reasons.
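A minimal sketch of what dynamic allocation means in practice: the application recomputes how many processing workers it needs as the load changes, instead of keeping enough for the peak running all day. The function name `workers_needed`, the per-worker capacity of 5,000 packets per second, the 20% headroom, and the hourly load figures are all hypothetical; a real deployment would rely on its cloud provider's own scaling mechanisms.

```python
def workers_needed(packets_per_second: int,
                   packets_per_worker_per_second: int,
                   headroom_pct: int = 20,
                   min_workers: int = 1) -> int:
    """Workers required to handle the load plus a safety margin."""
    if packets_per_worker_per_second <= 0:
        raise ValueError("worker capacity must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    required = packets_per_second * (100 + headroom_pct)
    capacity = 100 * packets_per_worker_per_second
    return max(min_workers, -(-required // capacity))


# Hypothetical load samples (packets/s) at five points in the day.
load_samples = [20_000, 250_000, 90_000, 280_000, 40_000]

# Dynamic allocation: size the pool for each sample individually.
dynamic_plan = [workers_needed(load, 5_000) for load in load_samples]

# Static allocation: keep enough workers for the peak at all times.
static_plan = [max(dynamic_plan)] * len(load_samples)

print("dynamic:", dynamic_plan, "total worker-periods:", sum(dynamic_plan))
print("static: ", static_plan, "total worker-periods:", sum(static_plan))
```

With these assumed numbers, the dynamically sized pool uses less than half the worker time of the statically provisioned one, which is the cost argument the paragraph above makes.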
So, while the Internet of Things may change the overall architecture of the cloud, the cloud as we know it today is essential to enable this change. Cloud computing, in the sense of virtualized computing resources that can be dynamically allocated by applications themselves, without human interference, isn't going anywhere. The Internet of Things will only make it grow.
After working for a database company for eight years, Thoran Rodrigues took the opportunity to open a cloud services company. For two years his company has been providing services for several of the largest e-commerce companies in Brazil, and over this time he has had the opportunity to work on large-scale projects ranging from data retrieval to high-availability critical services.