
Locking data and persisting CFCs in ColdFusion MX

Handling multiple execution threads simultaneously offers much higher performance than handling one thread at a time; however, multithreading and shared data also have implications that developers must carefully address.

ColdFusion MX is a multithreaded application server. Handling multiple execution threads simultaneously offers much higher performance than handling one thread at a time. CFMX also allows developers to store data in server memory. This lets us cache application-wide data easily, or handle user-specific data like a shopping cart. Multithreaded execution and persistent data are useful and powerful features. However, they also have implications that must be carefully addressed by developers. In this article I will explain why you need to properly lock reads and writes to shared variables in ColdFusion MX. In addition, I'll explore some implications of storing instances of ColdFusion Components (CFCs) in shared scopes.


Printable PDF

This article and the corresponding example code are available in a printable PDF download version.


Why lock?

Most ColdFusion developers are aware of thread locking on some level, so I won't go into great detail explaining why you need to do it. But a tried and true example is something like this: Say you have some code that increments an application variable (for anyone who doesn't know, an application-scoped variable is a memory-resident variable that is shared between all users and threads of a specific CF application):

<cfset application.counter = application.counter + 1 />

Seems simple enough, but what if two users run the same page at the same time? The result can be corrupt data:

  • User A runs the page
  • User B runs the page
  • User A's thread reaches the counter and reads application.counter which is 25.
  • User B's thread reaches the counter and reads application.counter which is 25.
  • User A's thread increments the counter it read (25) and sets it to 26.
  • User B's thread increments the counter it read (25) and sets it to 26.

The counter ends up set to 26, which is incorrect: two increments starting from 25 should yield 27. To avoid this situation, called a race condition (because two threads are "racing" to see who gets to the data first), you must specify single-thread access to this piece of code using the <cflock> tag:

<cflock name="applicationCounterIncrementLock" timeout="10" type="exclusive">
    <cfset application.counter = application.counter + 1 />
</cflock>

This will force only one thread at a time to execute that piece of code. So if User A's thread reaches this code first, it establishes an exclusive lock on the code until it has finished with it. If User B's thread gets here, it encounters the exclusive lock and will wait until that lock is released before it establishes its own exclusive lock on the code and performs its own increment of the application variable. The result is that User B's counter would be correctly set to 27.

There are two possible values for the "type" attribute of the <cflock> tag: "readonly" and "exclusive". A read-only lock allows as many threads as necessary to read the locked code, as long as there is not an exclusive lock on the same code. Conversely, an exclusive lock takes complete ownership of a piece of code, and will allow no other read-only or exclusive locks on the same code until it releases its lock. Typically, read-only locks are used when data is only being read from a shared scope. Exclusive locks are needed when data is being modified in a shared scope, as in our counter example above.
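As a minimal sketch of the read side, the snippet below pairs a read-only lock with the counter example by reusing the same lock name, so readers wait only while an increment is in progress. The local variable name currentCount is made up for illustration, and the snippet assumes application.counter has already been initialized.

```cfml
<!--- Read-only lock: many threads may hold it at once, but none while
      an exclusive lock with the same name is held --->
<cflock name="applicationCounterIncrementLock" timeout="10" type="readonly">
    <!--- Copy the shared value into a page-local variable inside the lock --->
    <cfset currentCount = application.counter />
</cflock>

<!--- Work with the local copy outside the lock to keep the lock short --->
<cfoutput>This page has been viewed #currentCount# times.</cfoutput>
```

Copying the value out and releasing the lock before producing output keeps the locked section as short as possible.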

Scoped locks and named locks

Locks can be scoped locks or named locks. A scoped lock means that the lock affects an entire variable scope. It is particularly important to understand this when using "exclusive" locks. For example, if you create an exclusive session-scoped lock, this means the code is locked for all threads from the same user. Scoped locks on the session scope are acceptable because they are only single-threading access to the code for a single user.

You can also create scoped locks on the application and server scopes, but this is highly inadvisable. Creating an "application" scope lock means that all threads for the entire application are affected by the lock. Even worse, a "server" scope lock means that all threads for the entire server are affected. This can result in massive bottlenecks as all threads wait for exclusive locks to be released.

For application and server-scoped variables, a named lock is more appropriate. In the counter example above, you can see I used an exclusive, named lock. This is much better than a scoped lock, because it single-threads only the threads that request the same named lock. Thus, we don't single-thread access for an entire application or server, but just for reads or writes within the section of code contained in the named lock. So the basic rule of thumb is to use session-scoped locks for session variables and named locks for application and server variables. Use scoped locks on the application and server scopes only with great caution.
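For comparison with the named lock in the counter example, here is a rough sketch of a session-scoped lock. It assumes session management is enabled for the application, and the variable session.lastVisit is hypothetical.

```cfml
<!--- Scoped lock: single-threads only requests that belong to this
      user's session, so other users are not held up --->
<cflock scope="session" timeout="10" type="exclusive">
    <!--- session.lastVisit is a hypothetical variable for illustration --->
    <cfset session.lastVisit = now() />
</cflock>
```

Note that a <cflock> tag takes either a scope attribute or a name attribute, not both.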

Shared scopes and CFCs

Most intermediate or advanced CF developers already understand the previous information. Where many CF'ers seem to be uncertain is in how shared scopes interact with instances of ColdFusion Components (CFCs).

CFCs are object-like constructs, and they can be instantiated into any of the shared variable scopes. This has opened a whole new world to CF developers, but also raises new questions about where and when to lock code. In particular, confusion seems to center around the difference between storing a stateless CFC vs. a CFC which is stateful.

A stateless CFC is one that has no internal instance data, or whose instance data is set at creation and almost never updated afterward. In other words, its internal state effectively does not change once it is created. Within CFCs, the "variables" scope represents instance data, so in practice, this means that a stateless CFC does not update any internal data in the variables scope. The "this" scope can also contain instance data, but because it is public it is often avoided.

A stateful CFC, on the other hand, contains instance data which changes. Its internal state is in flux, and data within its variables scope is modified.

The difference between these is very important, as it dictates the locking approach you must adopt.

Consider the following scenario involving a stateless CFC. A developer is building an application that must display current company news. He or she might wish to create a CFC to encapsulate data and behavior related to news items. Further, to improve performance this CFC could be instantiated into the application scope:

<cfset application.newsManager = createObject( 'component', 'newsManager' ).init() />

Now, instead of creating an instance of the newsManager component every time the news data is needed, the developer can just call methods on the application scoped CFC instance. For those familiar with OO design patterns, this is an implementation of the Singleton pattern.

Locking it

Sounds great; but wait, don't we have to fit locking in here somewhere, since we're using the application scope? Absolutely. However, with care we can minimize the amount of locking necessary. Let's think about this.

A single newsManager instance will be shared between all users of the entire application (as defined by the name used in the <cfapplication> tag). So it only needs to be created once, the first time any user runs the application. Thus, we only need one exclusive named lock on this code, and it must run on application startup:

<cfif not structKeyExists( application, 'newsManager' )>
    <cflock name="loadNewsManagerLock" timeout="10" type="exclusive">
        <cfif not structKeyExists( application, 'newsManager' )>
            <cfset application.newsManager = createObject( 'component', 'newsManager' ).init() />
        </cfif>
    </cflock>
</cfif>

Note that by design, application.newsManager will only be created one time during the lifetime of an application. Once it is created, its internal state will not change. This has an important consequence: it means we don't need to lock subsequent reads of application.newsManager. There is no possibility of a race condition. This is because application.newsManager is a stateless CFC that has no changing instance data—in fact it has no instance data at all. (Note that with a framework like Mach-II, the framework itself manages the creation of application-scoped listener components.)

This minimizes the amount of locking we must do, but it also means we must be careful to adhere to our design decision that newsManager is a stateless CFC. We must ensure that no other developers accidentally recreate application.newsManager without proper locking. But more subtly, it means that the application.newsManager CFC should not contain internal instance data that is modified. Why? Since the newsManager CFC instance exists in the application scope, all instance data it contains also exists in the application scope.

That means no data inside the CFC should be written to the "variables" scope (or the "this" scope). Because our newsManager CFC has no instance data, we can safely avoid the need to lock method calls to this CFC instance. Take a look at the newsManager code in Listing A to see for yourself.

Example one

From Listing A, note that in the method getRecentNewsItems() I am careful to define my query variable "qRecentNewsItems" in the "var" scope. This tells CF that qRecentNewsItems is a method-local variable. It will only exist for the duration of the method call and is not instance data.

I cannot stress enough the importance of properly using the var scope within CFC methods! If I had not used the var scope for the query variable, CF would default to placing it in the "variables" scope. This would be instance data, and our CFC would then be exposed to potential race conditions if multiple threads call it simultaneously. Use the var scope for all method-local variables and avoid this subtle danger! Failure to do so will result in problems when the application is placed under load.
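Listing A is not reproduced here, so the following is only a rough sketch of what a stateless component with a var-scoped query could look like. It is not the article's actual listing: the datasource name newsDSN and the table and column names are assumptions made for illustration.

```cfml
<cfcomponent hint="Sketch of a stateless newsManager; holds no instance data">

    <cffunction name="init" access="public" returntype="any" output="false">
        <!--- Nothing is written to the variables or this scope --->
        <cfreturn this />
    </cffunction>

    <cffunction name="getRecentNewsItems" access="public" returntype="query" output="false">
        <!--- var makes the query local to this method call. Without this
              line, cfquery would write qRecentNewsItems into the shared
              variables scope of the component instance. --->
        <cfset var qRecentNewsItems = "" />

        <!--- Hypothetical datasource, table, and columns --->
        <cfquery name="qRecentNewsItems" datasource="newsDSN">
            SELECT newsID, headline, postedDate
            FROM news
            ORDER BY postedDate DESC
        </cfquery>

        <cfreturn qRecentNewsItems />
    </cffunction>

</cfcomponent>
```

In ColdFusion MX, var declarations must appear at the top of the function body, before any other tags except <cfargument>.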

Example two

In the newsManager example we avoid the need for read-only locks on the application scope because we have made the architectural decision that newsManager would be stateless. But what about a situation where we know we will have changing data within a CFC that is held in a shared scope? It means that either your calls to the CFC must be properly locked, or the CFC itself must take responsibility for locking internally.

As I have said, data stored in the "this" scope and the "variables" scope is instance data for that CFC. But most people don't think about the need to lock these scopes when a CFC is stored in a shared scope. So let's make it very obvious. I'll skip over the "this" and "variables" scopes for a moment and consider a CFC that manipulates data stored in the session scope. Remember that a session variable is a shared-scope variable that is specific to one user. A shopping cart is an example of this.

We can create an application-scoped cartManager just as we did the newsManager:

<cfif not structKeyExists( application, 'cartManager' )>
    <cflock name="loadCartManagerLock" timeout="10" type="exclusive">
        <cfif not structKeyExists( application, 'cartManager' )>
            <cfset application.cartManager = createObject( 'component', 'cartManager' ).init() />
        </cfif>
    </cflock>
</cfif>

But from here, things are handled differently than our newsManager CFC. The cartManager works as a session façade that manages interactions with a Cart CFC that we will hold in the session scope. Remember that session variables are specific to each user. Because the cartManager manipulates a Cart CFC instance in the session scope, the reads and writes to the Cart CFC should be properly locked to avoid race conditions. Such a situation could arise if the user is using the application in multiple browser windows. You can see the cartManager code in Listing B, and the Cart code is in Listing C.

Note that the Cart CFC itself has no locks and doesn't care or even know what scope it is being used in. The cartManager takes on these responsibilities. This is a good design practice that reduces coupling between your business object (the Cart) and the rest of the application. To avoid any possible race conditions, the cartManager handles locks for reads and writes to data in the session scope.
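Listings B and C are not reproduced here, so the following is only a sketch of how such a façade might wrap its session access in locks. The method names addItem and getItemCount, their arguments, and the assumption that the Cart component exposes methods of the same names are all hypothetical.

```cfml
<cfcomponent hint="Sketch of a session facade; not the article's Listing B">

    <cffunction name="init" access="public" returntype="any" output="false">
        <cfreturn this />
    </cffunction>

    <cffunction name="addItem" access="public" returntype="void" output="false">
        <cfargument name="itemID" type="numeric" required="true" />
        <cfargument name="quantity" type="numeric" required="true" />

        <!--- Exclusive: this call may create and will modify the user's cart --->
        <cflock scope="session" timeout="10" type="exclusive">
            <cfif not structKeyExists( session, 'cart' )>
                <cfset session.cart = createObject( 'component', 'Cart' ).init() />
            </cfif>
            <!--- Cart.addItem() is an assumed method on the Cart component --->
            <cfset session.cart.addItem( arguments.itemID, arguments.quantity ) />
        </cflock>
    </cffunction>

    <cffunction name="getItemCount" access="public" returntype="numeric" output="false">
        <cfset var itemCount = 0 />

        <!--- Read-only: concurrent reads from the same session may proceed together --->
        <cflock scope="session" timeout="10" type="readonly">
            <cfif structKeyExists( session, 'cart' )>
                <!--- Cart.getItemCount() is an assumed method on the Cart component --->
                <cfset itemCount = session.cart.getItemCount() />
            </cfif>
        </cflock>

        <cfreturn itemCount />
    </cffunction>

</cfcomponent>
```

Calling code would then use something like application.cartManager.addItem( 42, 1 ) without writing any locks of its own, and the Cart component stays free of scope-specific code.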

Hopefully it is clear that because our manager CFC is manipulating a variable in the session scope, we should incorporate locking. But remember that this is also true if a shared-scope CFC holds modifiable instance data in the "this" or "variables" scopes. Remember that if the CFC instance is kept in a shared scope, all of its instance data is also in the shared scope!
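To make that last point concrete, here is a minimal sketch of a stateful component that takes responsibility for its own locking. The component, its method names, and the lock name hitCounterInstanceLock are invented for illustration, and the sketch assumes a single instance held in the application scope.

```cfml
<cfcomponent hint="Hypothetical stateful component that locks internally">

    <cffunction name="init" access="public" returntype="any" output="false">
        <!--- Instance data: lives in whatever scope holds the instance --->
        <cfset variables.count = 0 />
        <cfreturn this />
    </cffunction>

    <cffunction name="increment" access="public" returntype="numeric" output="false">
        <cfset var newCount = 0 />

        <!--- Exclusive named lock guards the read-modify-write of instance data --->
        <cflock name="hitCounterInstanceLock" timeout="10" type="exclusive">
            <cfset variables.count = variables.count + 1 />
            <cfset newCount = variables.count />
        </cflock>

        <cfreturn newCount />
    </cffunction>

    <cffunction name="getCount" access="public" returntype="numeric" output="false">
        <cfset var currentCount = 0 />

        <!--- Same lock name, read-only: waits only while an increment is running --->
        <cflock name="hitCounterInstanceLock" timeout="10" type="readonly">
            <cfset currentCount = variables.count />
        </cflock>

        <cfreturn currentCount />
    </cffunction>

</cfcomponent>
```

Because the lock name is fixed, every instance of this component would share one lock. That is acceptable for a single application-scoped instance, but a component with many instances would need a different approach.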

A clearer picture

You can see code that runs and tests everything we've done so far in Listing D. I hope that this article has helped you build a clearer picture of locking in ColdFusion MX. And in particular, I hope it has shed some light on proper ways to deal with locking within shared CFC instances. Please post to the forums or contact me if you have any other ideas or questions. Thanks to Sean Corfield for his input regarding this article.
