Maintaining one or more request queues at the front end of an application server farm can greatly improve performance. In this article, I outline the benefits, drawbacks, and implementation concerns of three such designs: single, dual, and multiple queues.

Single-queue implementation
Arguably, the easiest approach to request-queue design is to implement a single queue. This design gives you only one queue to manage, and records do not have to be moved back and forth between pending and active queues as they are assigned to servers.

With this design, each record must store the request data, the status of the request, and an identifier for the server to which it was assigned. The server identifier is required for requeue in case of a server failure. If Pending and Active are the only states a request record can have, you may combine the status indicator and the designated server field into one. If the designated server field is blank or null, the record is pending; otherwise, the record is active.
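As a minimal sketch of the combined field, assuming an in-memory record rather than a database row, the status can be derived from the server identifier. The names RequestRecord, payload, and server_id are illustrative, not part of any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestRecord:
    """One entry in a single request queue (hypothetical layout)."""
    payload: str                      # the request data
    server_id: Optional[str] = None   # None means no server assigned yet

    @property
    def is_pending(self) -> bool:
        # Status is derived from the server field, so no separate
        # status indicator needs to be stored.
        return self.server_id is None


record = RequestRecord(payload="GET /report")
record.server_id = "app-01"           # assigning a server makes it active
```

This only works while Pending and Active are the sole states; a third state such as Failed would need its own field again.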

The price for this simplicity is additional complexity in the logic that determines which request to assign to the next available server. With both pending and active requests maintained in the same queue, the logic must traverse the queue, checking each record's status indicator until it finds a pending record. Performing this search for every request assigned to a server can impose a significant performance hit, particularly if the queue is large and many requests are active simultaneously.

Additionally, upon server failure, you must search the entire queue, including both pending and active records, for each request assigned to the failed server. Once the requests are found, requeue is performed by simply clearing the designated server field and resetting the request status to Pending.
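The two searches described above might look like the following sketch. It assumes an in-memory list standing in for the queue; the class and method names (SingleQueue, assign_next, requeue_failed) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    payload: str
    server_id: Optional[str] = None   # None means pending


class SingleQueue:
    """Pending and active requests share one list (illustrative only)."""

    def __init__(self) -> None:
        self.records: List[Request] = []

    def submit(self, payload: str) -> None:
        self.records.append(Request(payload))

    def assign_next(self, server_id: str) -> Optional[Request]:
        # Linear scan: skip over active records until a pending one is found.
        for record in self.records:
            if record.server_id is None:
                record.server_id = server_id
                return record
        return None

    def requeue_failed(self, server_id: str) -> int:
        # The whole queue is searched, pending and active records alike.
        requeued = 0
        for record in self.records:
            if record.server_id == server_id:
                record.server_id = None   # clearing the field requeues it
                requeued += 1
        return requeued
```

Both operations are proportional to the total queue length, which is the cost the single-queue design pays for its simple record handling.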

Dual-queue implementation
Of course, the most logical way to separate records in a request queue is to split pending requests from active requests. A two-queue design is the simplest implementation of this scheme.

This design does not require the request status field. A request is in either the pending queue or the active queue. You must still maintain the designated server identifier, but only within the active queue, which means there’s less data to be stored.

You’ll encounter a small amount of increased overhead in maintaining two queues, but it shouldn’t add much complexity to your overall system. The primary benefit of a dual-queue implementation is that the logic for determining the next record to assign consists of merely retrieving the next request from the pending queue; no search is required. Since the client-server interface will perform this action most often, performance is markedly improved.

If a server fails, you must still perform a search for that server’s active requests, but the search will include only the records in the active queue. This yields a slight performance benefit over searching all records.

However, once each record is found, it must be transferred from the active queue to the pending queue. This operation is more expensive than simply modifying the status field of a record, and it is further complicated by the fact that the record formats in the active and pending queues are not identical. If these queues are maintained as database tables, moving a record from one to the other involves a deletion from one table and an insertion into the other.
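A rough in-memory sketch of the dual-queue design follows. The names are hypothetical, and placing requeued requests at the back of the pending queue is my assumption; the design itself does not dictate where they go.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class DualQueue:
    """One pending queue plus one active queue (illustrative only)."""

    def __init__(self) -> None:
        # Pending records carry no server identifier.
        self.pending: Deque[Tuple[int, str]] = deque()    # (request_id, payload)
        # Active records add the designated server.
        self.active: Dict[int, Tuple[str, str]] = {}      # request_id -> (server_id, payload)
        self._next_id = 0

    def submit(self, payload: str) -> int:
        request_id = self._next_id
        self._next_id += 1
        self.pending.append((request_id, payload))
        return request_id

    def assign_next(self, server_id: str) -> Optional[Tuple[int, str]]:
        # No search: take the head of the pending queue.
        if not self.pending:
            return None
        request_id, payload = self.pending.popleft()
        self.active[request_id] = (server_id, payload)
        return request_id, payload

    def requeue_failed(self, server_id: str) -> int:
        # Only the active queue is searched.
        failed = [rid for rid, (sid, _) in self.active.items() if sid == server_id]
        for rid in failed:
            _, payload = self.active.pop(rid)      # "delete" from active
            self.pending.append((rid, payload))    # "insert" into pending
        return len(failed)
```

Note how requeue_failed has to drop the server identifier when it converts an active record back into the pending format.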

Multiple-queue implementation
Another alternative is the multiple-queue scenario, which requires a single pending queue and an active queue for each application server. This design yields the greatest number of queues to maintain, but it simplifies other aspects of the logic.

With separate queues for each server, no additional data needs to be stored with the request record. Any record assigned to a server’s queue is active and being handled by that server. In case of server failure, that server’s queue, in its entirety, is added to the pending queue. No search is required, and because no other data is stored, the record format should be identical between the queues. If the formats are different, the record has to be read into one structure from the table where the deletion will occur, and the pertinent elements must be transferred to another structure for insertion into the other table.

As in the dual-queue implementation, the logic for determining the next record to assign consists of merely retrieving the next request from the pending queue. Again, no search is required.
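The multiple-queue design can be sketched in memory as one pending queue plus a dictionary of per-server queues. All names here are hypothetical, and appending a failed server's requests to the back of the pending queue is an assumption on my part.

```python
from collections import deque
from typing import Deque, Dict, Optional


class MultiQueue:
    """One pending queue and one active queue per server (illustrative only)."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()
        self.active: Dict[str, Deque[str]] = {}   # server_id -> its requests

    def add_server(self, server_id: str) -> None:
        # Each new server needs its own queue: n + 1 queues in total.
        self.active[server_id] = deque()

    def submit(self, payload: str) -> None:
        self.pending.append(payload)

    def assign_next(self, server_id: str) -> Optional[str]:
        # No search: take the head of the pending queue.
        if not self.pending:
            return None
        payload = self.pending.popleft()
        self.active[server_id].append(payload)
        return payload

    def fail_server(self, server_id: str) -> int:
        # No search: the failed server's whole queue moves to pending.
        # Records need no conversion because both formats are identical.
        orphaned = self.active.pop(server_id, deque())
        self.pending.extend(orphaned)
        return len(orphaned)
```

Membership in a server's queue is the only record of the assignment, which is why no status or server field appears anywhere in the request itself.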

The primary drawback to this method stems from maintaining all of the queues: n + 1 of them, where n is the number of application servers. If the queues are implemented as database tables, this also means the database schema must be altered each time a server is added to the system. This factor will not endear developers to their database administrators. Additionally, requeue of requests requires the transfer of records from their server queue to the pending queue. Operationally, this task incurs more overhead than simply modifying status fields in a single queue.

You must also consider timing issues in worst-case scenarios. Suppose the queues are implemented as database tables. If a record is deleted from a server’s active queue and the system crashes before the record is added to the pending queue, the request is lost. However, if the record is added to the pending queue first and the system crashes before it is deleted from its server’s queue, the record may be added to the pending queue a second time on startup.
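One common way to narrow this window, where the database supports it, is to perform the insertion and the deletion inside a single transaction so that both take effect or neither does. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names pending and active_app01, the column layout, and the requeue_all helper are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: one pending table and one table per server.
KNOWN_ACTIVE_TABLES = {"active_app01"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pending (request_id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE active_app01 (request_id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO active_app01 VALUES (1, 'GET /report')")
conn.commit()


def requeue_all(conn: sqlite3.Connection, active_table: str) -> None:
    """Move every record from a failed server's table to the pending table."""
    # Table names cannot be bound as SQL parameters, so only accept known ones.
    if active_table not in KNOWN_ACTIVE_TABLES:
        raise ValueError(f"unknown active table: {active_table}")
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if it raises, so the two statements are atomic.
    with conn:
        conn.execute(
            f"INSERT INTO pending SELECT request_id, payload FROM {active_table}"
        )
        conn.execute(f"DELETE FROM {active_table}")


requeue_all(conn, "active_app01")
```

This only helps when both tables live in the same database; queues spread across separate systems would need a different recovery scheme.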

Other benefits of establishing request queues
The primary reason for implementing a request queue design is to prevent server failure when more requests are received than a machine can handle, but the queues also make it easy to add features that improve system monitoring. For example, other applications may use data generated from the request queues. This information could be used to analyze the performance of the front-end interface and individual servers as well as the system as a whole.
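As one small illustration of such analysis, suppose each request logs when it entered the pending queue and when it was assigned to a server. Those timestamps are an assumption, since the designs above do not require them, and the function name is hypothetical. A monitoring tool could then compute the average wait per server:

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# Each event is (server_id, enqueued_at, assigned_at), with times in seconds.
Event = Tuple[str, float, float]


def average_wait_by_server(events: Iterable[Event]) -> Dict[str, float]:
    """Average time requests spent pending before reaching each server."""
    waits: Dict[str, List[float]] = defaultdict(list)
    for server_id, enqueued_at, assigned_at in events:
        waits[server_id].append(assigned_at - enqueued_at)
    return {server_id: mean(values) for server_id, values in waits.items()}


sample = [("app-01", 0.0, 2.0), ("app-01", 1.0, 5.0), ("app-02", 0.0, 1.0)]
averages = average_wait_by_server(sample)
```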