Every ColdFusion developer should know about the cflock tag and its associated attributes. Unfortunately, many ColdFusion developers are a little cflock shy. In this article I will explain why you should make an effort to get to know this important tag and introduce you to the issues that make its existence necessary.
To understand the idea of locking, you must first understand the idea behind multithreading. Multithreading is the mechanism by which the ColdFusion server allows multiple requests for the same ColdFusion template to be processed at the same point in time. There is a potential problem, however, and it has to do with data integrity.
The code in Listing A is designed to track the number of visits the page has received since the server was last started. The ideal place to store this data is in ColdFusion’s application scope. Keep in mind that all users of your application share data stored in the application scope. As you will see shortly, it is this sharing that can cause the headaches.
The first portion of code declares an application scoped variable, application.numberOfHits, if it does not yet exist. The next line simply increments this value each time the page is processed. Pretty simple, right? Things become a little more involved when you factor in two things—the multithreading behavior I just discussed and the manner in which data manipulation takes place behind the scenes. As it stands, the code in Listing A is not doing anything to protect itself from potential data corruption; it is unsafe code.
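For reference, code matching this description might look like the sketch below. It is not the original listing: the application name hitCounterDemo is made up, and the starting value of 0 is taken from the walkthrough later in this article.

```cfml
<!--- Assumption: this tag normally lives in Application.cfm; "hitCounterDemo" is a hypothetical name --->
<cfapplication name="hitCounterDemo">

<!--- Declare the counter if it does not yet exist --->
<cfif not isDefined("application.numberOfHits")>
    <cfset application.numberOfHits = 0>
</cfif>

<!--- Unsafe: nothing stops two requests from running this line at the same time --->
<cfset application.numberOfHits = application.numberOfHits + 1>
```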
How cfset works behind the scenes
To see how problems with data integrity arise, you need to know how cfset behaves and get comfortable with what I will call "the three mini-steps of data manipulation." ColdFusion developers usually consider things such as variable assignments to be a one-step process; after all, a simple cfset tag is all it takes. However, behind the scenes, things are a little more involved. The line below, from Listing A, is a typical cfset operation.
<cfset application.numberOfHits = application.numberOfHits + 1>
It looks like a single step but in reality this is processed in three completely separate steps:
- Get the current value of application.numberOfHits.
- Add 1 to that value.
- Save the new value back to application.numberOfHits.
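One way to picture the three mini-steps is to write them out as separate statements. This is only an illustration of the idea, not what the server literally executes; the variable names currentValue and newValue are made up for the example, and the counter is assumed to exist already.

```cfml
<!--- Step 1: get the current value --->
<cfset currentValue = application.numberOfHits>

<!--- Step 2: add 1 to the copy we are holding --->
<cfset newValue = currentValue + 1>

<!--- Step 3: save the new value back to the shared variable --->
<cfset application.numberOfHits = newValue>
```

Another request can slip in between any two of these statements, which is exactly where the trouble starts.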
Herein lies the problem. Because of multithreading, if two or more requests try to perform this variable assignment at the same time, they can cross paths and do things to your data that look very strange to the untrained eye.
Once the code in Listing A has been processed twice by the ColdFusion server, your intuition tells you the application.numberOfHits variable should reach a grand total of 2. It was originally 0, the page executed twice, thereby causing 1 to be added on each occasion. So the value 2 makes perfect sense.
However, if the code in Listing A were processed twice by two simultaneous visitors to your Web site, the application.numberOfHits variable might only reach 1. It’s a bit of a mindbender, but if you keep the "three mini-steps" from above in mind, you will spot the problem very quickly. Picture the situation: The first thread performs step 1 and gets the current value of 0. Before it can save anything, the second thread also performs step 1 and gets the same 0. The second thread adds 1 and saves the value 1. The first thread, still holding the value 0, also adds 1 and saves the value 1. That leaves us with two page hits but an increment of just 1!
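The lost update can be replayed by hand in a single request. The sketch below uses an ordinary local variable in place of the shared counter and performs the mini-steps for two imaginary threads, A and B, in the unlucky order; all variable names are made up for the illustration.

```cfml
<!--- Stand-in for application.numberOfHits --->
<cfset counter = 0>

<!--- Thread A, step 1: reads 0 --->
<cfset valueSeenByA = counter>
<!--- Thread B, step 1: also reads 0, because A has not saved anything yet --->
<cfset valueSeenByB = counter>

<!--- Thread B, steps 2 and 3: adds 1 and saves 1 --->
<cfset counter = valueSeenByB + 1>
<!--- Thread A, steps 2 and 3: adds 1 to its stale 0 and saves 1 again --->
<cfset counter = valueSeenByA + 1>

<!--- Prints: Two hits, counter = 1 --->
<cfoutput>Two hits, counter = #counter#</cfoutput>
```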
There are also other unsafe combinations of these steps. It’s only a lost page hit in this particular scenario, and your boss might not care too much. But when numbers represent dollars, for example, things can get pretty serious. So let’s learn how to avoid such a situation occurring in the first place.
All you need is the cflock tag and three of its attributes: scope, type, and timeout. This is easiest to explain by first fixing your unsafe example code. Have a quick look at Listing B.
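A locked version of the counter might look like the sketch below. It is not the original listing; it assumes the application scope has already been enabled and simply wraps the increment in an exclusive, application-scoped lock with a 10-second timeout.

```cfml
<!--- Declare the counter if it does not yet exist --->
<cfif not isDefined("application.numberOfHits")>
    <cfset application.numberOfHits = 0>
</cfif>

<!--- Only one request at a time may run the code between the cflock tags --->
<cflock scope="application" type="exclusive" timeout="10">
    <cfset application.numberOfHits = application.numberOfHits + 1>
</cflock>
```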
The thing to notice here is the cflock tag surrounding the second cfset tag. What I've done here is to lock all other threads of execution out of this portion of code until the changes have been made. After the changes have been made, and only after, can another thread of execution enter this block and do its thing. Now let’s have a closer look at the type, scope, and timeout attributes.
How long do you wait?
The timeout attribute is simply a way of telling ColdFusion not to wait from now until eternity for a lock. It sets the number of seconds a request will wait while the data-changing code within another request's cflock tag does its work. At some point, you might want to acknowledge that there has been a problem and allow ColdFusion to throw an error. A generally accepted value for this is 10 seconds, though you will not be ousted from the ColdFusion community should you dare to experiment with shorter or longer timeouts.
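If you would rather handle a timeout yourself than let the error reach the visitor, one option is to wrap the lock in cftry and catch the lock error. This is a minimal sketch under the assumption that the counter already exists; the fallback message is invented for the example.

```cfml
<cftry>
    <!--- Wait at most 10 seconds for the lock, then throw an error --->
    <cflock scope="application" type="exclusive" timeout="10" throwontimeout="yes">
        <cfset application.numberOfHits = application.numberOfHits + 1>
    </cflock>

    <!--- Runs only if the lock could not be obtained in time --->
    <cfcatch type="lock">
        <cfoutput>The hit counter is busy, so this visit was not counted.</cfoutput>
    </cfcatch>
</cftry>
```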
Which scopes need a lock?
The scope attribute in the example indicates that I have locked the ColdFusion application scope (scope="application"). The only other value you usually would put here is session, which, you guessed it, locks the session scope.
Now why would you need to lock the session scope? Isn’t data in this scope private to each user of your application and therefore not susceptible to the problem of multiple users trying to access data simultaneously? Well, the multithreading problem is less likely to occur, but it's still a possibility. Consider a situation in which the user of your Web application is repeatedly hitting a submit button on an order form, sending multiple submissions at the same time! Frames-based Web sites and sites opened in multiple browser windows can also lead to a single user making multiple requests. So it’s a good idea to lock the session scope, because you can never be sure of exactly how a user may interact with your Web application.
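As an illustration of the impatient-submitter case, the sketch below uses a session-scoped lock so that only one of several simultaneous submissions is treated as the real order. It assumes session management is enabled for the application; session.orderSubmitted and processOrder are hypothetical names, and the order handling itself is left as a placeholder.

```cfml
<cfset processOrder = false>

<!--- Requests from the same session pass through this block one at a time --->
<cflock scope="session" type="exclusive" timeout="10">
    <!--- Only the first submission finds the flag missing --->
    <cfif not isDefined("session.orderSubmitted")>
        <cfset session.orderSubmitted = true>
        <cfset processOrder = true>
    </cfif>
</cflock>

<cfif processOrder>
    <!--- Placeholder: the real order handling would go here --->
    <cfoutput>Order received.</cfoutput>
<cfelse>
    <cfoutput>This order has already been submitted.</cfoutput>
</cfif>
```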
How strong is your lock?
In my example, the cflock tag literally banned all access to the cfset tag until the closing cflock tag was reached. This was a good thing in this case, as I wanted the one thread to have an exclusive lock on this code. Sometimes, this is a little harsh. What if you had some code that merely wanted to read the value of the application.numberOfHits variable and had no interest in changing it? For example, the code below:
<cfoutput>Visits to this page so far: #application.numberOfHits#</cfoutput>
Do you really want to force all other threads to wait while you read the value of this variable? When you use readonly as a value to the type attribute, you effectively say, “Get the value of this variable, but wait until any data changing code has done its job.” All other users trying to read this value are not made to wait in line; they simply wait for any code that has an exclusive lock to finish doing its thing before they read the value. The difference is subtle, but important.
Listing C polishes off my example by adding a readonly lock to the cfoutput block that simply prints the current value of application.numberOfHits.
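The read side might look like the sketch below. It is not the original listing, and it assumes the counter has already been created and incremented earlier on the page.

```cfml
<!--- Readers do not block one another; they only wait for an exclusive lock to finish --->
<cflock scope="application" type="readonly" timeout="10">
    <cfoutput>Visits to this page so far: #application.numberOfHits#</cfoutput>
</cflock>
```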
By the way, if you fail to provide a type attribute you get "exclusive" by default. So it’s a good idea to put some thought into what kind of lock is needed. Automatically making everything an exclusive lock can get to be a real drag on performance.
Safer multithreading with cflock
Once you get the idea behind the cflock tag, using it is easy. The key is to learn to think like the multithreaded ColdFusion server does, and put the cflock tag into action when you need to. Certainly, crucial data stored in the application or session scope should be locked. Of course, it’s up to you to determine which data is crucial.