Threading in programming languages is not a new idea. Having multiple points of execution is a concept that's fundamental to any multitasking environment. Java, however, requires developers to use explicit threading to accomplish tasks as basic as reading input without blocking. This reliance on threading for simple functions has developers working with Java threads while they’re still learning the language. Unfortunately, if Java is their first language, this can mean tackling threading before they’re ready. Let’s look at some of the gotchas that can sneak in while using threads.
This is a quickie that everyone hits the first time they try to use wait. Listing A shows a wait call being made on an arbitrary object. The code compiles and will run fine right up until the wait call, at which point it will throw an IllegalMonitorStateException. Before calling the wait method on an object, you must first obtain that object’s monitor by enclosing the wait call in a synchronized block. An example of this is shown in Listing B. Fortunately, this gotcha is easy to catch; the exception gives it away. Most threading errors aren’t so nice and fail intermittently or deadlock silently. Let’s look at a few of those next.
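To make the difference concrete, here is a minimal sketch of both cases. It is not the article's Listing A or B; the class name WaitDemo and its method names are hypothetical, and a timed wait is used only so the demo returns without needing a notify call.

```java
public class WaitDemo {

    /** Calls wait() without owning the monitor; returns true if the exception is thrown. */
    public static boolean waitWithoutMonitor(Object lock) {
        try {
            lock.wait(10); // not inside synchronized (lock), so this throws
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    /** Calls wait() while holding the monitor; returns true if the call completes normally. */
    public static boolean waitWithMonitor(Object lock) {
        synchronized (lock) { // obtain the object's monitor first
            try {
                lock.wait(10); // timed wait: returns after the timeout if nobody notifies
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Object lock = new Object();
        System.out.println("without monitor threw: " + waitWithoutMonitor(lock));
        System.out.println("with monitor completed: " + waitWithMonitor(lock));
    }
}
```

The same rule applies to notify and notifyAll: the calling thread must hold the monitor of the object it calls them on.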
Unsynchronized exit conditions
A common use of threads is for a looping dispatch or processor thread. These threads loop indefinitely, waiting for an external thread to alter the state of a mutually accessible object. However, if care is not taken to synchronize access to the object both threads share, the loop can subtly hang. In Listing C, you see a processor thread that loops until the boolean flag done is set to true. It’s clear that the author of the code intends for some external thread to set the done flag to true and have the loop exit at the end of its current iteration.
But you might get tripped up with the propagation of the state of the Boolean variable, done, across thread boundaries. The Java memory model makes no guarantee that changes made to done in one thread are reflected in another. Thus, it is possible that the external thread could toggle the flag’s state, but the while loop in the other thread loops forever. Seasoned developers may now be saying that they’ve used code just like that seen in Listing C, and it works fine. They’re right; usually it works just fine. However, there’s no guarantee that it will always work as expected, and every experienced developer knows how Murphy's law applies to important events and intermittent failure.
The Java memory model requires that variable state be correct after a synchronization barrier has been encountered. This can be used to eliminate the risk of an infinitely looping processor thread, as shown in Listing D. The well-read Java developer might wonder whether the volatile keyword could be used to fix the same problem. The volatile keyword was created for just this sort of situation: it was intended to force a fresh read from memory so that a thread never works from a stale copy of a variable that another thread has updated. Unfortunately, the original Java Language Specification was insufficiently detailed when addressing the workings of volatile, and many early virtual machines ignored the keyword entirely. The revised memory model that arrived with Java 5 (JSR-133) closed that gap, so on current virtual machines a volatile done flag is also a sound fix.
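The synchronized approach can be sketched as follows. This is an illustration under stated assumptions rather than the article's Listing D: the class name FlagProcessor, its methods, and the counter that stands in for real work are all hypothetical.

```java
public class FlagProcessor implements Runnable {
    private boolean done = false; // guarded by this
    private long iterations = 0;  // guarded by this; placeholder for real work

    /** Called by the external thread to request that the loop exit. */
    public synchronized void finish() {
        done = true;
    }

    /** Synchronized read, so the loop is guaranteed to see the update. */
    public synchronized boolean isDone() {
        return done;
    }

    public synchronized long getIterations() {
        return iterations;
    }

    @Override
    public void run() {
        while (!isDone()) {
            synchronized (this) {
                iterations++; // stands in for one unit of processing
            }
            Thread.yield();
        }
    }

    /** Starts a processor, asks it to stop, and reports whether it exited in time. */
    public static boolean runAndStop(long timeoutMillis) {
        FlagProcessor processor = new FlagProcessor();
        Thread worker = new Thread(processor, "processor");
        worker.start();
        processor.finish();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("processor stopped: " + runAndStop(2000));
    }
}
```

Both the write in finish and the read in isDone go through the same monitor, which is what gives the loop a guaranteed view of the updated flag.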
Reading while writing
It took me a long time to believe this one was possible. You probably know that the Java collections, unlike the older Vector and Hashtable classes, aren’t synchronized at the method level. One reason is that synchronized methods on collections give a false sense of security with respect to concurrent modification of the data contained therein. You may also know that if you are going to have two threads modifying the same object, you need to make sure they’re not doing so at the same time. However, for some collections you must take care to make sure that even a read and a write don’t happen at the same time. The modifications made to the collection's internal data storage are not necessarily atomic, so while the writing thread modifies the collection, the reading thread can peek in and get undefined results.
For instance, examine the code in Listing E. Since only one of the two threads is writing to the HashMap, it would seem at first glance that no synchronization is necessary. However, it is possible that during the put action on the HashMap, a get will find the HashMap in an inconsistent state.
Listing F shows the same code with a synchronized block added. Notice that the put and get can no longer happen at the same time.
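A minimal sketch of that kind of fix, assuming a single writer thread and a single reader thread sharing one HashMap, might look like this. The GuardedMap class, its demo method, and the key names are hypothetical and are not taken from Listing F.

```java
import java.util.HashMap;
import java.util.Map;

public class GuardedMap {
    private final Map<String, Integer> map = new HashMap<>();

    public void put(String key, int value) {
        synchronized (map) { // writers take the map's monitor
            map.put(key, value);
        }
    }

    public Integer get(String key) {
        synchronized (map) { // readers take the same monitor, so a get never overlaps a put
            return map.get(key);
        }
    }

    public int size() {
        synchronized (map) {
            return map.size();
        }
    }

    /** Runs one writer and one reader against a shared map and returns the final size. */
    public static int demo(int entries) {
        GuardedMap shared = new GuardedMap();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < entries; i++) {
                shared.put("key" + i, i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < entries; i++) {
                shared.get("key" + i); // may be null if the writer has not reached this key yet
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println("entries stored: " + demo(1000));
    }
}
```

The important detail is that the reader locks the same object as the writer; synchronizing only the put would still let a get run in the middle of it.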
When the data in two data sources must be kept cohesive in a multithreaded application, it's necessary to obtain the synchronization lock for both data store objects before making any changes. Care must be taken to ensure that all code seeking to hold both monitors simultaneously obtains them in the same order. If, as shown in Listing G, one thread nests its synchronization blocks in an order contrary to that of another thread vying for the same monitors, a deadlock can result. Each thread is stuck at the point indicated by the deadlock comment and is waiting for the release of the monitor that the other one already has.
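The consistent-ordering rule can be sketched like this. The example is hypothetical and shows the repaired form rather than the deadlock in Listing G: the class OrderedLocks and its two lists, primary and mirror, stand in for the two data stores.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderedLocks {
    private final List<Integer> primary = new ArrayList<>();
    private final List<Integer> mirror = new ArrayList<>();

    /** Adds to both stores, locking primary first and mirror second. */
    public void add(int value) {
        synchronized (primary) {
            synchronized (mirror) {
                primary.add(value);
                mirror.add(value);
            }
        }
    }

    /** Clears both stores using the SAME lock order as add(). */
    public void clear() {
        synchronized (primary) {     // nesting mirror outside primary here
            synchronized (mirror) {  // would reintroduce the deadlock risk
                mirror.clear();
                primary.clear();
            }
        }
    }

    public boolean inSync() {
        synchronized (primary) {
            synchronized (mirror) {
                return primary.equals(mirror);
            }
        }
    }

    /** Runs an adding thread and a clearing thread; true if both finish and the stores agree. */
    public static boolean demo(int rounds) {
        OrderedLocks stores = new OrderedLocks();
        Thread adder = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                stores.add(i);
            }
        });
        Thread clearer = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                stores.clear();
            }
        });
        adder.start();
        clearer.start();
        try {
            adder.join(5000);
            clearer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !adder.isAlive() && !clearer.isAlive() && stores.inSync();
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + demo(10000));
    }
}
```

The order in which the lists are modified inside the blocks does not matter; only the order in which the monitors are acquired does.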
A subtler version of this problem appears in Listing H. Remember that synchronized methods are just standard synchronized blocks around the whole method, using the object locally referenced by the reserved word this for locking. Fortunately, once found, the repair of these deadlocks is as simple as reordering the synchronization for one of the threads.
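The hidden lock is easy to confirm with Thread.holdsLock. The sketch below is a hypothetical illustration, not part of Listing H: a class named MethodMonitor showing that a synchronized method and a synchronized (this) block hold the same monitor, while a plain method holds none.

```java
public class MethodMonitor {

    /** A synchronized instance method locks on this for its whole body. */
    public synchronized boolean heldInSynchronizedMethod() {
        return Thread.holdsLock(this);
    }

    /** The equivalent explicit form. */
    public boolean heldInSynchronizedBlock() {
        synchronized (this) {
            return Thread.holdsLock(this);
        }
    }

    /** An unsynchronized method does not hold the monitor. */
    public boolean heldInPlainMethod() {
        return Thread.holdsLock(this);
    }

    public static void main(String[] args) {
        MethodMonitor m = new MethodMonitor();
        System.out.println("synchronized method holds lock: " + m.heldInSynchronizedMethod());
        System.out.println("synchronized block holds lock: " + m.heldInSynchronizedBlock());
        System.out.println("plain method holds lock: " + m.heldInPlainMethod());
    }
}
```

Because the lock is implicit, a synchronized method that calls a synchronized method on another object is nesting two monitors, and the ordering rule applies just as it does to explicit blocks.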
I would be remiss if I didn’t mention the classic threading problem found in what’s been dubbed "the Double-Check Idiom." Bill Pugh has written an excellent article explaining this most common and counterintuitive threading problem.
Threading can be a tricky affair. But once you've made each of these mistakes once, you can put aside any dread associated with multithreaded code and start using it more and more. Sufficiently threaded code can, under the right conditions, leave program execution time completely I/O bound, which is a great way to answer any charges about Java’s slowness.