Costs and risks to consider when planning a move to the public cloud

If you're considering a move to the public cloud, you need to weigh its costs and risks against those of creating your own private cloud. Here, Colin Smith gives you some points to consider and an example of one such calculation.

Amazon's EC2 has set the bar for VM/hour costs in the range of $0.05 for a small (2GB) reserved instance. This is the benchmark that internal IT organizations will need to compare against. How does your organization measure up? Do you have an internal cost per VM hour that you can meaningfully compare to EC2? In this post, I will try to compare the cost savings associated with a public IaaS cloud and the value-add of internal IT. (Note: I used EC2 for my analysis but a similar analysis can be made of other cloud providers.)

In trying to understand the value proposition of internal IT compared to a public cloud, I wanted to compare like to like. Since the charging metric used by public clouds is cost per VM hour, I looked for industry figures using that metric, but I couldn't find anything authoritative. So I took it upon myself to create a working number based on conservative estimates, deliberately deflated a little, so that whatever I come up with is at least in the ballpark, if not lower than, the actual cost per server hour for the typical enterprise. Here's how I came up with $0.25 per server hour:

1. I used the Rackspace dedicated hosted server basic configuration as an analog to the traditional data center server. The cost for a 3.5GB hosted server is $419 per month, which works out to about $0.58 per hour.

2. Let's assume a 20% margin on the part of Rackspace and a 20% management efficiency on the part of the on-premises IT department, and we come up with an hourly cost of $0.35 per hour.

3. Now, the Rackspace server is a little beefier than the EC2 small instance, so let's shave off another $0.04 per hour (about 11%).

4. That leaves us with a cost of $0.31 per hour.

5. I've probably missed something, so let's apply a margin of error of 20%, and we end up with $0.25 per hour as a conservative estimate.
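The five steps above can be sketched as a short calculation. The figures are the post's own assumptions, not measured data; 720 hours approximates a 30-day month, and rounding to cents at each step mirrors the post's arithmetic.

```python
# Back-of-envelope estimate of on-premises cost per server hour,
# following the post's five steps. All inputs are assumptions.
HOURS_PER_MONTH = 720  # ~30-day month

cost = round(419.0 / HOURS_PER_MONTH, 2)  # 1. $419/month dedicated server -> $0.58/hr
cost = round(cost * (1 - 0.40), 2)        # 2. 20% margin + 20% mgmt efficiency -> $0.35
cost = round(cost - 0.04, 2)              # 3. shave ~11% for the beefier spec -> $0.31
cost = round(cost * (1 - 0.20), 2)        # 4-5. apply a 20% margin of error -> $0.25

print(f"${cost:.2f} per server hour")     # -> $0.25 per server hour
```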

So now we are comparing a public cloud IaaS offering at $0.05 per VM hour against $0.25 per physical server hour, a fivefold cost differential. How does the IT department address that gap?

Lowering costs

First of all, we are comparing VM hours with physical server hours. We need to adjust for that. Some analysts claim server utilization rates in the 4% (Data Centers Only Operating at 4% Utilization) to 7% (Kundra: Fed Data Centers 7 Percent Utilized) range, but according to Gartner, traditional data center server utilization is between 15% and 20%. If virtualization can get that number up to 40%, then we are looking at $0.125 per VM hour, assuming no increased costs associated with virtualizing.
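The utilization adjustment can be expressed as a simple ratio: doubling utilization through virtualization halves the effective per-VM-hour cost. A minimal sketch, using the post's figures and ignoring any added virtualization costs as the post does:

```python
# Effective cost per VM hour after consolidating via virtualization:
# the physical-hour cost scales by (baseline utilization / virtualized utilization).
def per_vm_hour(physical_cost: float, baseline_util: float, virtualized_util: float) -> float:
    return physical_cost * (baseline_util / virtualized_util)

# Post's assumptions: $0.25/physical hour, 20% baseline, 40% virtualized.
cost = per_vm_hour(0.25, baseline_util=0.20, virtualized_util=0.40)
print(cost)  # -> 0.125
```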

Adopting virtualization technologies is a great first step in lowering costs but we're still at more than double the public cloud price. What else can the IT department do? Most organizations of significant size will introduce management tools to reduce overall management costs and increase the reliability of the data center.

So perhaps that reduces the cost by another 20%, bringing our internal costs down to $0.10 per VM hour.

Business requirements

Not bad, but still double the public cloud price. Can you justify the higher cost to your CIO or CFO? This is where understanding your organization's business comes into play. What are the non-cost issues that you need to be aware of? What about some of the following:

SLAs: EC2 offers a 99.95% uptime guarantee, but what happens when your servers go offline for an extended period of time? Remember last year when lightning hit one of Amazon's data centers and some clients were down for four hours? You get a credit on your invoice for the following month, if you bother to ask for it. How do the EC2 data center operations staff prioritize which of the thousands of servers need to come up first? What if your SQL Server comes up before your Domain Controller?

Security: Data leaks? Data ownership? Jurisdiction? How well screened and controlled is the cloud provider's staff (remember David Barksdale)? What type of perimeter security do they have? What rights do you give up when your data is stored on another organization's hardware? What procedures do they have in place when they dispose of hard disks? What happens in a multi-tenant cloud when one tenant is investigated by authorities like DHS or the IRS, and all of the hardware that held their information is seized? Is there a guarantee that your data will not leave a particular jurisdiction?

Regulatory compliance: SOX, PCI, PIPEDA, HIPAA, GLBA, etc. Can you maintain compliance with a key system in the public cloud?

Others: Are your existing licenses portable to the cloud? What happens if your cloud provider goes out of business?

Are these risks ones that your organization is willing to accept? Once you have a handle on your business requirements, you might just find that it makes sense to move some applications to a public cloud. Perhaps training servers, demo environments, or UAT are a good fit.

Conclusion

I've highlighted just a few of the areas where on-premises IT can add real value to justify costs higher than the public cloud. Couple this with strategies to lower operating costs (like virtualization, high availability, self-service portals, rich management tools) and it starts to look like your organization has taken steps towards creating a private cloud. The actual costs for your organization may be different but I think the message is clear: Enterprise IT's best response to the public cloud is to start planning for a private cloud!

About

Colin Smith is a Microsoft SCCM MVP who has been working with SMS since version 1.0. He has over 20 years of experience deploying Microsoft-based solutions for the private and public sector with a focus on desktop and data center management.

12 comments
davidsmi

Great blog, Colin! In the organizations I contract for, the staffing costs are significantly more than the cost of the hardware. That is saying a lot, since servers, SANs, and backup infrastructure are really expensive. If cloud computing can lower server costs, great, but a 10% savings on 25% of an IT budget is only 2.5% of the IT budget. How can the cloud save staffing costs? Does it call for re-writing custom applications, or re-purchasing COTS applications? Why do organizations pay hundreds of dollars for an internal email account when they can get the exact same service externally for much less (even free)?

dagblakstad

This is comparing apples to oranges. The main benefit of an elastic cloud is extremely low startup costs and the ability to scale horizontally in a very short time. This is very important when you have periodic needs or when traffic rates are unpredictable. As for security and reliability: what makes you think your staff is more honest than a cloud computing provider's? Lightning can strike anyone, and clouds are probably better backed up with power supplies and alternative network routes in case of disasters than 90% of company data centers. Price is a simple calculation of difference, and is pretty easy to figure out. Comparing apples and oranges is not.

tom

Wow, that is some of the most convoluted thinking I've seen in years, even if I only consider what's missing from it. Be careful, guys.

jascc1

Good post - thanks for the calculations to use as a framework. I personally feel the same way about putting 'important' data-dependent stuff out on the cloud. My organization is considering Microsoft's BPOS, which is not hardware, just SaaS, but even though BPOS is "Microsoft," so reliability and security concerns may wane a little, the same concerns are there (e-mail, SharePoint data, etc.). Not to mention the bottleneck/dependency of the Internet in general. You'd better plan for a second backup data line/carrier (cable?) on your internal network as failover - which in turn means there are costs there as well. I'd personally rather give it 5 more years before considering critical data, which is why development, training, and UAT environments would make more sense.

jascc1

You have to take all of the 'logical' tangibles (servers, data storage, etc.) and intangibles (existing man hours, performance, scalability, security, etc.) into consideration when discussing a potential enterprise-level IT change such as moving to a public cloud. Certainly worth putting into the discussion.

jascc1

I thought it was a decent post, but I'm curious about your concerns so I can indeed 'be careful' :)

jesavro

How about the benefit of being able to launch multiple instances within minutes for running specific jobs and then being able to shut them down without paying for a full month or setup fee?

TAPhilo

Now that all servers are in the cloud, your outgoing/incoming Internet connection has GOT to have dual-homed failover and be beefy enough to handle the data to and from your hosted system. After all, if some user asks for a 50,000-row return from a server 20 times a day, it's got to get down to their desktop in 20 seconds just like it did when hosted locally! Now, if you host EVERYTHING in the cloud, and only displays come back to a local workstation, you are fine - but you still need the failover dual-homed network in and out connections (backhoes have a way of hitting only 1 cable at a time).

Of course, now all your testing HAS to be done on the virtual hosts - no touchy-feely local consoles anymore - and if you tell it "shutdown" (the default) instead of "restart," then instead of a walk to the computer room you have a trouble call and a wait for someone to get it back online.

Now, if a catastrophic outage occurs at their data center, and you do not have an alternate (and YOU must update all your systems), what then (see the recent outage in Australia of the airlines reservation center)? You now need to have even more detailed and tested recovery plans than you would locally - since you cannot actually see or touch ANYTHING anymore. It causes even a great shop to need to hire even more highly technical people to run and fix remotely hosted things. These are the types of things that are always labeled "out of scope" when people do comparisons between hosted and non-hosted business-critical systems . . .

The Colin Smith

Good point - I conveniently ignored attempting that calculation because 1) I couldn't find an industry average that was easy to work with (3-5 years seems typical), and 2) I considered it to be built into the Rackspace dedicated server price, since they would replace the server on their schedule and factor that into their monthly cost.

techrep1000

The title of the post is "COSTS and RISKS ...", not "BENEFITS and ADVANTAGES...". If you want a list of benefits, go to the Amazon EC2 site. I'm sure you'll find some unbiased ones there. You may not agree with his calculations of costs (I have some quibbles), but I agree with the risks outlined. It's a good start, but way too meagre for a serious discussion.

jhu0596

Most dedicated servers have lights-out management, e.g. Dell servers with Enterprise DRACs (Dell Remote Access Controllers). These allow you to power the server on and off remotely. The DRAC also gives you console redirection and virtual media capability, so if the server hangs before the network services are started (e.g. due to file system corruption), you have almost the same level of access as you would on a physical server - you could mount your rescue disk as virtual media. The major drawback is dealing with physical hardware problems; perhaps you need to reseat/remove/replace components to get the server back into production.

I personally believe it is important to maintain geographic separation between the backup media and the device being backed up. FYI, having a business continuity plan which includes offsite backups is crucial whether the physical hardware is local or in a data center. Several years ago, I was a system admin for a medium-size manufacturer. Around 11 am, one of the shop guys came running into the office saying the oven's on fire, everyone get out. There was only a metal wall separating my office and the servers from the manufacturing floor. I had the tapes for the nightly backup in my laptop bag; I normally changed the tapes at the end of the day and took the previous night's backup to the safety deposit box at the bank. I had been waiting outside in the parking lot for about 2 minutes when the thought crossed my mind: if I don't get the tapes that are in the servers, I will lose a whole day's worth of data. So I ran back into the office and pressed the eject button on the three tape drives. They were going whir-click, whir-click while the fire on the other side of the wall sounded like popping popcorn. If the popping had been more frequent or louder, I would probably have left before the tapes ejected. I returned to the parking lot. My boss told me that I didn't have to do that, but he was glad I did.

There were a number of best practices I had implemented that made it much easier to pick up the pieces after the fire. Multihomed network connectivity is also very important whether your servers are at the data center or local. Bean counters may give you some grief about paying for two Internet connections; the easiest way to handle this is to calculate the cost to the business when Internet connectivity is lost. If your business does a significant amount of business online, it is easy to justify adding a second connection. Depending upon your IT budget, it may be justifiable to have both local servers and ones in a data center: one set used as primary servers, the other as a disaster recovery site, with the two synced to reduce/eliminate downtime.