One of the great fears for many Web developers, and even more so for Web server admins, is watching their sites brought down by an avalanche of traffic. Learn to stress-test your Web servers with Siege, an open source tool.
One of the great fears for many Web developers, and even more so for Web server admins, is watching their sites brought down by an avalanche of traffic. The traffic may be from a DoS attack, or maybe the site just got Slashdotted. Bottom line: It just isn’t available.
Do you know at what point your own site will collapse? That’s what load/stress testing is all about, and there are a number of great tools to help you do it. A proprietary tool called WAPT was previously reviewed here at Builder.com. But I’m an open source kinda guy and always prefer to use tools that are freely available. One of the most prevalent and well-maintained open source tools for stress testing is aptly called Siege.
What is Siege?
The name says it all—it lays siege to your server so you can see how it’ll hold up. It's a command-line UNIX-based tool licensed under the GNU GPL open source license, which means it’s free to use, modify, and distribute. Siege can stress-test a single URL, or it can read many URLs into memory and stress them simultaneously with a user-definable number of simulated users. The program reports the total number of hits recorded, bytes transferred, response time, concurrency, and return status. Siege supports HTTP/1.0 and 1.1 protocols, GET and POST directives, cookies, transaction logging, and basic authentication.
Getting and installing Siege
You can download the current version of Siege through our Builder Download site. Installation is a relatively standard compilation process for UNIX applications using GNU autoconf. If you’re running a modern Linux (or other *nix) system with a standard ANSI C compiler (part of most default *nix installations), the installation process is pretty straightforward.
First, you’ll need to untar the package:
$ tar xvzf siege-latest.tar.gz
Then you’ll need to configure it; the default configuration is a good start:
$ ./configure
Configuration help is available via ./configure --help. The only option I’ve personally added is SSL support, via --with-ssl=/usr/local/ssl.
Next, it’s time to compile and install:
$ make
$ make install
Siege has a lot of options for how to "lay siege" to a Web server. The simplest way to get a feel for the program is to run a test against a single URL. A single-URL test is also a good indication of how a particular page will hold up against the Slashdot effect or a similar massive traffic driver.
Two critical options are the number of concurrent users (-c, default is 10) and the length of the test, expressed either as a number of repetitions (-r) or as a time limit (-t). Let’s start with a simple example: 25 concurrent users for one minute.
$ siege -c25 -t1M www.example.com
** Siege 2.59
** Preparing 25 concurrent users for battle.
The server is now under siege...
Lifting the server siege... done.
Transactions: 406 hits
Availability: 99.75 %
Elapsed time: 59.66 secs
Data transferred: 10340918 bytes
Response time: 2.36 secs
Transaction rate: 6.81 trans/sec
Throughput: 173330.84 bytes/sec
Successful transactions: 412
Failed transactions: 1
Some of the other commonly used options are -v (verbose output) and -d, which sets a random delay between each simulated user's requests.
Understanding the results
The analysis that Siege leaves you with can tell you a lot about the sustainability of your code and server under duress. Obviously, availability is the most critical factor. Anything less than 100 percent means there's a user who may not be able to access your site. So, in the above case, there's some issue to be looked at, given that availability was only 99.75 percent on a sustained 25 concurrent, one-minute user siege.
Concurrency is calculated as the total time spent in all transactions (a transaction being a server hit, including any authentication challenges) divided by the elapsed time. It tells us the average number of simultaneous connections. High concurrency may be a leading indicator that the server is struggling: the longer the server takes to complete transactions while it’s still opening sockets to handle new traffic, the higher the concurrency climbs and the worse the server performance will be.
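As a back-of-the-envelope check, you can approximate average concurrency from the sample report above by multiplying the transaction count by the average response time and dividing by the elapsed time:

```shell
# Approximate average concurrency from the sample run:
# 406 transactions x 2.36 sec avg response / 59.66 sec elapsed
awk 'BEGIN { printf "%.2f\n", 406 * 2.36 / 59.66 }'   # prints 16.06
```

In other words, in that hypothetical run the 25 simulated users kept roughly 16 connections open at any given moment.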
Advanced Siege techniques
When I first started using Siege, my method for stressing my server was a bit hit-and-miss. I would just add concurrent users and extend the time index manually to see what results I'd get. The Siege package does include a tool called Bombardment, which incrementally increases the number of clients based on command-line options. However, I found that Bombardment just doesn’t work as well as it should, because it lacks some of the user-defined options that Siege itself has, such as verbose output and logging.
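A plain shell loop can serve as a stand-in for Bombardment while keeping all of Siege's own options. The sketch below is a hypothetical ramp from 5 to 50 users against a placeholder host; it only prints the commands it would run (the echo is a dry-run guard), so remove the echo once the plan looks right:

```shell
#!/bin/sh
# Hypothetical concurrency ramp: step from 5 to 50 simulated users
# in increments of 5, one minute per step, logging results with -l.
# 'echo' makes this a dry run; delete it to actually lay siege.
for c in $(seq 5 5 50); do
    echo siege -c"$c" -t1M -l www.example.com
done
```

Because each step goes through Siege proper, you still get verbose output, logging, and the full results report at every concurrency level.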
Proper regression testing also involves more than just a single URL. Siege used to support a tool called Scout that "harvested" URLs from a target domain and included them in a file for Siege to test. (Scout is no longer being maintained, but you can still download it and use it.)
Siege now supports a new proxy tool called Sproxy, which harvests URLs for testing. The premise behind Sproxy doesn’t make much sense to me. You set it as a proxy for your browser and then all the URLs the browser visits are logged for Siege testing. Personally, I prefer Scout for getting my URLs, since it just goes through a site’s links and adds them that way. The only other option is manually inputting all the URLs.
Siege will stress any GET/POST parameters, authentication, and/or cookies you may have on your site. All you have to do is supply the URL (or have Sproxy or Scout harvest it). Adding the -i option to a multiple-URL run makes the testing more lifelike by having the concurrent threads hit the harvested (or hand-entered) URLs in random order.
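For example, a small URLs file might look like the following (the paths and form fields are made-up placeholders, not anything Siege ships with), and Siege reads it with the -f option:

```shell
# Build a hypothetical urls.txt; every path and form field below
# is a placeholder for your own site's pages.
cat > urls.txt <<'EOF'
http://www.example.com/
http://www.example.com/about.html
http://www.example.com/login.php POST user=test&pass=secret
EOF
# Then run the multi-URL siege, randomizing order with -i:
# siege -c25 -t1M -i -f urls.txt
```

The last line of the file uses Siege's URL-file syntax for form submissions: the URL comes first, then the POST directive, then the form data.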
Improving availability and response times
Siege is a straightforward tool for stress-testing URLs. The critical components that make up a particular URL’s availability and ability to withstand stress are networking elements and page-specific elements that Web developers can influence. On the networking side of the equation, bandwidth (throughput capacity) and memory on the host Web server (or servers, if they're sitting behind a load balancer) will make all the difference. Of course, the tighter your code, the smaller your file size (hopefully), which means less drain on the networking resources. Builder.com has some great resources on the ROI of using CSS, which should help a bit in the bandwidth area.
Use Siege to determine what your server can withstand. Your existing setup may just be enough to withstand the day-to-day siege that your everyday users lay on it.