
Achieving scalability for your Web infrastructure

Hardware is one of the key factors in building a scalable Web site infrastructure. But whether you are scaling up or scaling out, TechRepublic contributing author Kevin Brown has some suggestions to help you plan your implementation.


Two of the most critical concerns for your Web site are availability and scalability. Achieving them is an ongoing process that involves three key areas of design: hardware, software, and procedures. In this article, we’ll focus our attention on hardware issues.

When considering hardware, you need to understand the difference between scaling out and scaling up. Scaling out is the process of adding more machines to handle larger loads on the system. Scaling up is the addition of components, such as more RAM, hard drives, and network cards, to an existing machine so that it can handle larger loads. Scaling up is limited by how much a single machine can hold; scaling out offers far more room to grow, because you can keep adding machines to your existing system.

Hosting facilities
So what types of hardware are needed for a truly robust Web site architecture? Start with the hosting facility. Even the largest Internet companies today use a hosting service to supply the necessary foundation for a successful Web site.

Here are some items to consider when choosing a provider:
  • The facility should offer fully redundant systems all the way to your rack, starting with power.
  • The line(s) into your rack should be connected to redundant routers to ensure an equipment outage does not affect availability.
  • There should be redundant lines out of the facility with different providers if the provider doesn’t have a private network. (Usually it is better if these data lines are on different sides of the building.)
  • Fire protection and security are essential for the hosting facility.

You should also ask the hosting provider what precautions it takes to prevent denial of service attacks. Such attacks can be very difficult to fend off without assistance from the hosting provider.
Hardware
The next factor to consider is the supporting hardware for the rack. The assumption thus far has been for a single rack of space. (If the installation already requires space beyond one rack, then most likely you’ve addressed these issues.) First, work with the selected hosting provider to see if a UPS (uninterruptible power supply) is recommended for the rack. Some providers recommend that customers supply their own UPS, but others guarantee uninterruptible power as part of the hosting fee. Once the power supply has been arranged, make sure there is enough power for all of the equipment. You won’t know the power requirements until all of the equipment for the rack is finalized, but keep it in mind. Also, consider a KVM switch, so that one monitor and keyboard can operate all the boxes in the rack. KVM switches range from simple, inexpensive units to costly units with remote management capabilities.
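
The power check described above comes down to simple arithmetic, sketched here in Python. Every figure is an assumption made for illustration: the wattages are hypothetical, and the 120-volt, 20-amp circuit with an 80 percent usable margin is only a placeholder for whatever your hosting provider actually supplies.

```python
# Hypothetical wattages for illustration; use the figures for your own equipment.
EQUIPMENT_WATTS = {
    "firewall": 60,
    "switch": 80,
    "kvm switch": 20,
    "web server 1": 200,
    "web server 2": 200,
    "database server": 450,
    "standby database server": 450,
}


def circuit_budget_watts(volts, amps, margin_percent=80):
    """Usable watts on a circuit, keeping a safety margin (assumed 80 percent)."""
    return volts * amps * margin_percent // 100


def power_report(equipment, volts=120, amps=20, margin_percent=80):
    """Compare the total draw of the rack against the circuit budget."""
    total = sum(equipment.values())
    budget = circuit_budget_watts(volts, amps, margin_percent)
    return {"total_watts": total, "budget_watts": budget, "fits": total <= budget}


if __name__ == "__main__":
    print(power_report(EQUIPMENT_WATTS))
```

Rerun the check whenever the equipment list changes, since the total is not final until the rack is.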

Firewalls
The next piece of equipment for the rack is a firewall. A firewall is critical for keeping unwanted traffic out of the Web site. It should offer complete control over the various ports and allow connections on those ports only from designated locations. Personnel managing the site will probably need Telnet and FTP access to the machines, but the typical visitor does not need access to these ports. A number of manufacturers (3Com, Cisco, F5 Networks, and so on) offer firewall products suitable for a Web site. Most likely, the firewall functionality can be combined with load balancing so that requests are routed to the most available server.
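
To make the idea of per-port control concrete, here is a minimal Python sketch of the decision a firewall makes for each connection. It is not the configuration syntax of any firewall product. The rule table and the management network 203.0.113.0/24 (an address range reserved for documentation) are assumptions chosen for illustration.

```python
import ipaddress

# Hypothetical management network; substitute the addresses your staff connect from.
ADMIN_NET = ipaddress.ip_network("203.0.113.0/24")
ANYWHERE = ipaddress.ip_network("0.0.0.0/0")

# Port -> networks allowed to connect. Any port not listed is closed.
RULES = {
    80: [ANYWHERE],   # HTTP is open to every visitor
    21: [ADMIN_NET],  # FTP only from the management network
    23: [ADMIN_NET],  # Telnet only from the management network
}


def is_allowed(source_ip, port, rules=RULES):
    """Return True if a connection from source_ip to port matches a rule."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in rules.get(port, []))


if __name__ == "__main__":
    print(is_allowed("198.51.100.7", 80))  # an ordinary visitor requesting a page
    print(is_allowed("198.51.100.7", 23))  # the same visitor trying Telnet
    print(is_allowed("203.0.113.5", 23))   # site personnel using Telnet
```

The table denies by default: a port that is not listed accepts no connections at all, which matches the goal of exposing only what visitors need.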

Switches
Next for the rack is a switch. A dependable switch with dedicated bandwidth for each port is essential in order to allow the Web site to scale. Keep in mind that it is necessary to supply redundancy for these components, either within the rack or by duplicating the entire rack. Look for switches that offer manageability and port-level monitoring so that bottlenecks in network traffic can be tracked easily.

Which Web server?
Finally, it is time to consider choices for the Web servers. No matter what operating system you choose, each Web server should have sufficient RAM for the Web server software to cache the pages of the Web site. For most implementations, 512 megabytes in each Web server is usually an appropriate amount of RAM.

Consider using the 1U form factor units. These machines are thin and allow more servers in the rack than typical units. A 1U Web server from Dell, Compaq, or Intel should not cost more than $3,000 with a single processor and two network interface cards. These units make great Web servers since they are inexpensive. With proper load balancing in place, a bad unit can simply be removed and replaced with another. But don’t invest in hardware RAID or duplicate power supplies in these units. It’s important to treat each Web server as a replaceable part. Purchasing redundancy within the unit assumes an attempt to scale up rather than scale out and only serves to make the units more expensive.
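
The replaceable-part idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how any particular load balancer works: the ServerPool class, the server names, and the choice of "fewest open connections" as the measure of the most available server are all assumptions.

```python
class ServerPool:
    """A pool of interchangeable Web servers behind a load balancer (hypothetical)."""

    def __init__(self, names):
        # Maps each server name to its number of open connections.
        self.active = {name: 0 for name in names}

    def pick(self):
        """Route a request to the server with the fewest open connections."""
        if not self.active:
            raise RuntimeError("no Web servers available")
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Record that a request on the named server has finished."""
        if self.active.get(name, 0) > 0:
            self.active[name] -= 1

    def remove(self, name):
        """Pull a bad unit out of rotation."""
        self.active.pop(name, None)

    def add(self, name):
        """Put a replacement unit into rotation."""
        self.active.setdefault(name, 0)


if __name__ == "__main__":
    pool = ServerPool(["web1", "web2", "web3"])
    pool.remove("web2")  # web2 has failed
    pool.add("web4")     # its replacement goes into the rack
    print([pool.pick() for _ in range(4)])
```

Because no server holds anything the others lack, removing one and adding another changes only the pool membership, which is why redundancy inside each unit adds cost without adding much availability.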

Database server considerations
Most Web sites today are not just static HTML pages but dynamic data-driven sites that require a back-end database server. The ultimate in scalability and reliability with database servers is clustering. While clustering solutions offer redundancy and scalability, they are very expensive to purchase, complicated to set up, and require a good deal of maintenance. If a cluster solution is within the budget, it is highly recommended that you make this choice early.

For more limited budgets, consider a database box that will scale up. Purchase a box with hardware RAID (level 1 and level 5), hot-swappable drives, a large amount of RAM (1 GB to start), one to four processors, and two NICs. Today’s database software makes good use of multiple processors and RAM. More memory to buffer the database means fewer trips to the hard drive to retrieve the data required. No matter how much RAM you have, however, you will still need a fast disk system, such as SCSI, to ensure that disk I/O does not become a bottleneck for the database.

Purchase a second database machine of similar configuration for use as a standby server. While this doesn’t supply automatic fail-over for the database, it does keep a database server ready to take over in the event of failure. With proper monitoring and procedures, support personnel can be prepared to switch over to the standby database server and suffer minimal downtime. If even a small amount of downtime is unacceptable, a cluster solution is necessary, along with the larger price tag it carries.
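
The monitoring half of that procedure might look like the following Python sketch. It assumes some health check already exists that reports whether the primary database answered, and it assumes that three consecutive failures is the point at which staff should be alerted. Both the FailoverMonitor class and that threshold are hypothetical, and the switch to the standby remains a manual step.

```python
class FailoverMonitor:
    """Tracks health checks on the primary database server (hypothetical helper)."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # consecutive failures before alerting staff
        self.consecutive_failures = 0

    def record(self, check_passed):
        """Record one health check; return True when staff should switch over."""
        if check_passed:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


if __name__ == "__main__":
    monitor = FailoverMonitor(threshold=3)
    for passed in [True, False, False, True, False, False, False]:
        if monitor.record(passed):
            print("Primary database is down: switch to the standby server")
```

Requiring several failures in a row keeps a single dropped check from sending staff to the standby unnecessarily.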

Connection options
Connect each Web server’s primary NIC to the main switch to serve Web pages to the public Internet. The firewall product will translate the local IP address for each machine to the appropriate external IP for the Web site. Connect the second NIC to another switch to handle communication with your database. This way, the traffic for incoming HTTP requests and outgoing HTTP responses can occur on a different channel from the database requests and responses generated by the Web server application.

For the database servers, use the second NIC for communications between the two machines. As data is replicated to the standby database, it can occur on the second NIC and avoid tying up the primary NIC that is being used for communication with the Web servers.
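
One way to keep the two channels straight is to write the address plan down and check it, as in this Python sketch. The subnets, host names, and addresses are hypothetical; the only point is that every primary NIC sits on the front network and every second NIC sits on a separate back network.

```python
import ipaddress

# Hypothetical address plan: one subnet per switch.
FRONT_NET = ipaddress.ip_network("10.0.1.0/24")  # main switch, HTTP traffic
BACK_NET = ipaddress.ip_network("10.0.2.0/24")   # second switch, database traffic

WEB_SERVERS = {
    "web1": {"primary": "10.0.1.11", "second": "10.0.2.11"},
    "web2": {"primary": "10.0.1.12", "second": "10.0.2.12"},
}


def check_layout(servers, front=FRONT_NET, back=BACK_NET):
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    if front.overlaps(back):
        problems.append("front and back networks overlap")
    for name, nics in servers.items():
        if ipaddress.ip_address(nics["primary"]) not in front:
            problems.append(f"{name}: primary NIC is not on the front network")
        if ipaddress.ip_address(nics["second"]) not in back:
            problems.append(f"{name}: second NIC is not on the back network")
    return problems


if __name__ == "__main__":
    print(check_layout(WEB_SERVERS) or "address plan looks consistent")
```

The same check extends to the database servers if their second NICs are given a network of their own for replication.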

The NICs and switches should be 100-megabit Ethernet or even the new Gigabit Ethernet. Other technologies, such as ATM, exist but are generally expensive and complicated to set up and maintain. Once again, if the budget allows implementation of these technologies, give them some consideration, since they usually offer significant performance gains over traditional Ethernet technologies.

Summary
Work with the hosting provider as equipment choices are made. Many decisions will be influenced by the requirements of the provider. For instance, some providers may require that all customers use Cisco routers. The hosting provider is also a great source of information about how to design the network to support the maximum amount of traffic.
