While most discussions about cloud computing center on technology, software, or whatever exciting new service happens to be in the spotlight, people often forget one of the most basic elements of the cloud: the service level agreement. Since moving to the cloud means looking at computing from a service perspective, the first concern of both customers and companies should be the service level agreement, not the fancy technological elements.
Think of cloud computing in terms of a relationship. Before the cloud, customers and companies had long-distance relationships - brief but passionate encounters at deployment, followed by long periods apart waiting for new versions. On the cloud, everyone is living together, with no time apart. In this reality, the most important thing is to avoid annoying the other party so much that they move out. The service level agreement is the key to this: it establishes the ground rules for what may or may not be done and the level of annoyance each party will tolerate.
An important thing to realize is that the perception of quality in a service is relative to the needs of customers. I don't mind if the electricity goes out in my home while I am at work, but if the power goes out when I am sitting down to watch some TV, I will hate the power company. The same goes for cloud services. If you deliver 99% availability, but the 1% of downtime always falls in the middle of the business day, customers will quickly abandon your service, even though it is within the agreed level.
The flip side of this coin is that most of us don't really need 99.995% availability; we need 100% availability whenever we decide to use the service. Since most usage habits are pretty predictable, a company offering a cloud service (especially software) can optimize itself so that offline periods fall outside peak usage times. The cloud enables companies to monitor usage patterns easily and closely, so if you find out that people only use your cloud-based solution on the first five business days of each month during business hours (this could be a payroll application, for instance), you had better make sure that your servers can handle the peak load and that you have round-the-clock support during this time. At the same time, you can probably save money (and pass these savings on to clients) by reducing your capacity and having fewer support people during the rest of the month.
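To put numbers on the two points above, here is a minimal Python sketch. The function names, the 30-day month, and the single seven-hour outage are all made up for illustration; they do not come from any real service or agreement. The sketch first converts an availability percentage into a downtime budget, and then measures availability over two different windows: the whole month, and a single business day.

```python
from datetime import datetime, timedelta


def allowed_downtime(target_pct, period):
    """Downtime budget implied by an availability target over a period."""
    return period * (1 - target_pct / 100)


def measured_availability(outages, window_start, window_end):
    """Percentage of the window [window_start, window_end) with no outage.

    outages is a list of (start, end) datetime pairs, assumed not to overlap.
    """
    total = (window_end - window_start).total_seconds()
    down = 0.0
    for start, end in outages:
        # Count only the part of the outage that falls inside the window.
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            down += (overlap_end - overlap_start).total_seconds()
    return 100 * (1 - down / total)


# Hypothetical data: a 30-day month with one outage on a Monday, 10:00 to 17:00.
month_start = datetime(2024, 6, 1)
month_end = datetime(2024, 7, 1)
outages = [(datetime(2024, 6, 3, 10), datetime(2024, 6, 3, 17))]

budget_99 = allowed_downtime(99, month_end - month_start)
budget_99995 = allowed_downtime(99.995, month_end - month_start)

monthly = measured_availability(outages, month_start, month_end)
workday = measured_availability(
    outages, datetime(2024, 6, 3, 9), datetime(2024, 6, 3, 18)
)

print("Downtime budget at 99%:", budget_99)
print("Downtime budget at 99.995%:", budget_99995)
print("Availability over the month: %.2f%%" % monthly)
print("Availability during that business day: %.2f%%" % workday)
```

Under these assumptions, a 99% target allows about 7.2 hours of downtime in a 30-day month, while 99.995% allows only a little over two minutes. The single outage keeps the monthly figure just above 99%, yet availability during that one business day (taken here as 9:00 to 18:00) is roughly 22%. The same outage looks acceptable or disastrous depending on the window the customer actually cares about.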
Another very important thing to remember is that a service level agreement is only as useful as the user's ability to monitor it. Transparency is key here. Whoever is using a service must have a simple way not only to check whether the service is online or offline (like flipping a switch to check if the power is on), but also to monitor whatever metric was established in the agreement. If you are building a cloud service, remember to include a "control panel" so that your customers, the press, and even the competition can quickly see your status. Always remember that transparency creates tolerance: the most stressful thing about a traffic jam is not knowing what is going on and how far it stretches.
So, if you are building a cloud service, or moving your company's existing software to the cloud, this is my suggestion: think first about the service level your customers want and you can offer, and build your new service around it. Don't simply say, "I'm on the cloud, I'll be available 24x7"; that way you avoid frustration on both sides. If you are purchasing a cloud service, make sure that you are neither demanding nor paying for more availability than you actually need.
If you have ever come across any unusual service agreements, or have been victimized by one, please share your experience. Also, if you have suggestions about how companies may improve their service agreements or their transparency, let me know.
After working for a database company for 8 years, Thoran Rodrigues took the opportunity to open a cloud services company. For two years his company has been providing services for several of the largest e-commerce companies in Brazil, and over this time he has had the opportunity to work on large-scale projects ranging from data retrieval to high-availability critical services.