
Four solutions for managing open source server software

Setting up servers in the cloud is easy, but securely installing and configuring the software isn't. Here are some tools that can help.

Running software online demands agility. Businesses of all sizes are starting to see heavy traffic and to work with large amounts of data spread across multiple servers, which creates a host of operational problems. The issue is complicated, but several solutions are available.

For companies that have traditionally used proprietary commercial software, it's time to embrace the open source software used at companies like Netflix, Facebook, and LinkedIn. Any company can now run the same software as these industry giants without paying a dime in upfront costs or software licensing fees. This is quite a change from the first dot-com boom, when companies invested hundreds of thousands of dollars in infrastructure before they even launched a product.

Armed with just a credit card and 15 minutes, you can spin up a farm of 100 servers at Amazon Web Services (AWS) or Rackspace Cloud. Open source software has matured to the point where you can download and install an enterprise-level database server with a single command.

Server switch-up

Setting up servers might be much easier with services from Amazon and the like, but how do you go from there to having your software securely installed and running in the cloud? Here are four solutions to explore when getting started:

  1. Platform as a Service (PaaS): A new breed of hosting providers has cropped up that will manage much of this hassle. Here's how it works: You upload your code and tell the provider which services you need. This allows you to focus on your software instead of the infrastructure. But keep in mind that PaaS offerings are significantly more expensive than running your own servers, and they often provide less flexibility in the software you're able to run. Heroku appears to be the leader in this space at the moment, but services like dotCloud and Google App Engine also have compelling offerings.
  2. Configuration Management Tools: With the ephemeral nature of cloud servers, it's common to spin up multiple servers in a few hours and then tear them all down to save on costs. Configuration management (CM) lets you script this entire process in a declarative language, so it requires little to no manual effort once it is in place. These CM tools can push out configuration and software updates to a fleet of servers simultaneously. However, they often require significant effort during the initial setup, as well as ongoing tweaking. Chef and Puppet are popular tools here, but we're most excited about the newcomer, Salt, which provides a lot of functionality beyond the existing configuration management tools.
  3. Third-Party Services: Much of a system's logging and alerting can be handled by third-party services. In fact, the argument could be made that third-party providers are a better option than doing the work in-house. Popular companies in this space include New Relic, Sentry, Pingdom, and PagerDuty.
  4. Third-Party Consultants: The initial learning curve for all these tools is steep. Not only do you need to learn how to use the tools, but you also need to learn which tools you need. For many companies, it makes sense to bring on experts who are well-versed in this technology for the initial setup and training.
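The declarative idea behind configuration management can be illustrated with a toy example. This is only a sketch under stated assumptions: the `Package` class and the `plan` and `apply` functions are hypothetical names invented for illustration, an in-memory set stands in for a real server, and none of this is the API of Chef, Puppet, or Salt. It shows the general pattern of describing the desired state as data and computing only the changes needed to reach it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Desired state for one package (hypothetical model, not a real CM API)."""
    name: str
    installed: bool = True


def plan(desired, current):
    """Return the actions needed to move `current` toward `desired`.

    `current` is the set of package names already present on the server.
    """
    actions = []
    for pkg in desired:
        present = pkg.name in current
        if pkg.installed and not present:
            actions.append(("install", pkg.name))
        elif not pkg.installed and present:
            actions.append(("remove", pkg.name))
    return actions


def apply(actions, current):
    """Apply actions to an in-memory set that stands in for a real server."""
    result = set(current)
    for verb, name in actions:
        if verb == "install":
            result.add(name)
        else:
            result.discard(name)
    return result


# Desired state, written once and reused for every server.
desired = [Package("nginx"), Package("postgresql"), Package("telnet", installed=False)]

# Pretend this is what a freshly launched server looks like.
server = {"telnet", "nginx"}

first_run = plan(desired, server)    # changes needed on a new server
server = apply(first_run, server)
second_run = plan(desired, server)   # empty once the server has converged
```

Because the plan is derived from the difference between the desired and current states, running it a second time against a converged server produces no actions. That property is what makes it practical to apply the same description repeatedly to a whole fleet.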

Connection complications

Complications with your servers can be catastrophic if left unattended. We're seeing computing problems on a scale that we haven't needed to tackle in the past. On the Internet, a single website might serve millions of page views in a day. You can't handle this sort of traffic with a single machine. Companies like Facebook and Google maintain hundreds of thousands of servers to handle the massive amounts of traffic and data they see daily.

That issue is only exacerbated as the Web becomes more real-time. To update your browser with Twitter and Facebook updates, your computer is either holding open a long-running connection to the server or constantly opening new connections to ask for new data. This is the crux of the C10k problem, which asks: How do we get a server to handle 10,000 connections simultaneously? While some people have blown by the 10,000 number (Urban Airship does more than 500,000), there's still an upper limit, and it's less than the number of users we need to support on a high-traffic website.
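One common way to keep many mostly idle connections open in a single process is an event loop, where each waiting connection costs very little. The sketch below is a simulation under stated assumptions: it uses Python's `asyncio`, the names `hold_connection` and `simulate` are hypothetical, and each "connection" is just a coroutine that sleeps rather than a real socket. It is not a benchmark and not how any particular company solved the problem.

```python
import asyncio


async def hold_connection(state, hold_seconds):
    """Stand-in for one long-lived client connection (no real socket)."""
    state["open"] += 1
    state["peak"] = max(state["peak"], state["open"])
    try:
        # An idle wait: control returns to the event loop, so other
        # "connections" make progress while this one does nothing.
        await asyncio.sleep(hold_seconds)
    finally:
        state["open"] -= 1


async def simulate(n_connections, hold_seconds=0.1):
    """Hold n_connections open at once and report the peak concurrency."""
    state = {"open": 0, "peak": 0}
    await asyncio.gather(
        *(hold_connection(state, hold_seconds) for _ in range(n_connections))
    )
    return state


result = asyncio.run(simulate(10_000))
print(f"peak concurrent connections: {result['peak']}")
```

A real server would also have to contend with file descriptor limits, memory per connection, and the work done when data actually arrives, which is where the upper limit mentioned above comes from.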

Managing the configuration and deployment of software across multiple servers is an overwhelming problem at first. One misstep might open a big security hole or take down an entire website. Spending the time and resources to build a strong infrastructure upfront is a necessity; as the old adage goes, an ounce of prevention is worth a pound of cure. By doing some deliberate planning early on, you can spend more time investing in your product down the road.

Peter Baumgartner is the founder of full-service web studio Lincoln Loop, makers of Ginger, an online platform to help distributed teams communicate. Peter is an expert in Django-based web development and a thought leader in entrepreneurship and remote teamwork. He welcomes anyone to reach out to him on Twitter or Google+.