
How do I... Use threading to increase performance in C#? (Part 1)

Bad threading logic can actually slow down an application or, worse, cause it to throw intermittent exceptions. There really isn't much to worry about, though, if you plan ahead and take the correct precautions. Zach Smith shows you some examples of how to use threading in C# and how to protect yourself while using it.

Threading is commonly used by developers to increase the performance of applications. However, if used incorrectly, threading can have the exact opposite effect. Bad threading logic can actually slow down an application or, worse, cause it to throw intermittent exceptions. There really isn't much to worry about, though, if you plan ahead and take the correct precautions.

This blog post is also available in PDF form in a TechRepublic download, which includes an example Visual Studio Project application showing how these techniques work.

About the ThreadPool

In C#, the ThreadPool is basically a collection of threads that you can use to run asynchronous tasks. For instance, if you had to call a database with ten different independent queries, you could post the calls to the ThreadPool and the calls would be executed asynchronously. The ThreadPool handles pretty much everything for you -- the application developer just provides a method to execute and the ThreadPool executes it when a thread becomes available.

Although the ThreadPool is very easy to use, there are some tips and tricks that are helpful to know when dealing with it.

The flow of using the ThreadPool looks something like this:

  • You post a method to be executed by the ThreadPool
  • The ThreadPool waits for an idle thread
  • When an idle thread is found the ThreadPool uses it to execute your method
  • Your method executes and terminates
  • The ThreadPool returns the thread and makes it available for other processing

There are some limits to the ThreadPool and you should be aware that it won't be the optimal approach to all of your threading needs. Please read the section below titled "ThreadPool Limits" for more information on this.

Using the ThreadPool

To queue a task on the ThreadPool, call the ThreadPool.QueueUserWorkItem method. This method accepts a WaitCallback delegate (in practice, the name of a method that takes a single object parameter and returns void) and executes that method when a ThreadPool thread becomes available. A simple example of this is shown in Figure A. (The download version contains the actual code for all figures.)

Figure A

Simple threading

As you can see this example simply queues up 10 items, passing the value of "id" to each item. The items then print out some data to the console. Note that since these are executing asynchronously they won't necessarily print out in order. The execution order is not guaranteed, even if you queue the items up in order.

Another point to note about this code is that we're passing a variable of type int into each work item. This works because ints are value types, so each work item receives its own copy of the value. If we passed the same instance of a reference type to every work item instead, the threads would all share that one object, and changes made by one thread could overlap with the others and produce inconsistent results.
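The figure itself is not reproduced in this text, so the following is only a minimal sketch of what queuing ten items might look like, not the original code. The names SimpleThreadingSketch, ProcessItem, and QueueItems are hypothetical, and the counter and ManualResetEvent are an addition that lets the caller wait for the items to finish.

```csharp
using System;
using System.Threading;

public static class SimpleThreadingSketch
{
    // Number of queued items that have not finished yet (hypothetical helper state).
    private static int pending;
    private static readonly ManualResetEvent allDone = new ManualResetEvent(false);

    // Runs on a ThreadPool thread. 'state' is the boxed copy of id passed when queuing.
    private static void ProcessItem(object state)
    {
        int id = (int)state;
        Console.WriteLine("Item {0} running on thread {1}",
            id, Thread.CurrentThread.ManagedThreadId);

        // The last item to finish signals the waiting caller.
        if (Interlocked.Decrement(ref pending) == 0)
        {
            allDone.Set();
        }
    }

    // Queues 'count' items and waits up to ten seconds for them to finish.
    // Returns true if every item completed within that time.
    public static bool QueueItems(int count)
    {
        if (count <= 0)
        {
            return true; // nothing to queue
        }

        pending = count;
        allDone.Reset();

        for (int id = 0; id < count; id++)
        {
            // id is an int (a value type), so each work item gets its own copy.
            ThreadPool.QueueUserWorkItem(ProcessItem, id);
        }

        return allDone.WaitOne(TimeSpan.FromSeconds(10));
    }

    public static void Main()
    {
        // The printed order can differ from run to run.
        QueueItems(10);
    }
}
```

Because the work items run asynchronously, the console lines may appear in any order, which matches the behavior described above.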

Communicating from the ThreadPool

There are many times when you'll need to know when a thread has completed, or get some other information about the status of a thread. There are several ways to do this, but one of the easiest and most direct is to raise events from within the threads and subscribe to those events from your application. The code for this is shown in Figure B.

Figure B

Object threads

In this example we're creating ten objects, subscribing to the OnWorkComplete event, and adding the objects to a List<T> object. After the worker classes have been created we then loop through the list and call the Start() method on them.

The Start() method uses the ThreadPool internally, so the foreach statement doesn't actually wait for one object to complete its work before calling Start() on the next object in the list. The code for the WorkerClass class is shown in Figure C.

Figure C

Worker class

This class simply uses the ThreadPool.QueueUserWorkItem method to queue up the DoWork method, which has access to all of the class instance's members. The DoWork method then sleeps for one second to simulate work being done and then fires off the OnWorkComplete event.

Using this type of setup encapsulates your threads and gives you more granular control over them.
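Figures B and C are not reproduced in this text, so here is a rough sketch of how such a worker class and its caller could be written. It is an illustration under stated assumptions rather than the original code: the Id property, the ObjectThreadsSketch class, the RunWorkers method, the local list named workers, and the CountdownEvent used to wait for completion are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class WorkerClass
{
    // Raised on a ThreadPool thread when the simulated work has finished.
    public event EventHandler OnWorkComplete;

    public int Id { get; private set; } // hypothetical identifier

    public WorkerClass(int id)
    {
        Id = id;
    }

    // Returns immediately; the work itself runs on a ThreadPool thread.
    public void Start()
    {
        ThreadPool.QueueUserWorkItem(DoWork);
    }

    private void DoWork(object state)
    {
        Thread.Sleep(1000); // simulate one second of work

        // Copy the delegate first so a concurrent unsubscribe cannot null it out.
        EventHandler handler = OnWorkComplete;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

public static class ObjectThreadsSketch
{
    // Creates 'count' workers, starts them all, and returns how many reported completion.
    public static int RunWorkers(int count)
    {
        int completed = 0;
        using (CountdownEvent done = new CountdownEvent(count))
        {
            List<WorkerClass> workers = new List<WorkerClass>();
            for (int i = 0; i < count; i++)
            {
                WorkerClass worker = new WorkerClass(i);
                // The handler runs on a pool thread, so the counter is updated atomically.
                worker.OnWorkComplete += delegate(object sender, EventArgs e)
                {
                    Interlocked.Increment(ref completed);
                    done.Signal();
                };
                workers.Add(worker);
            }

            // Start() only queues the work, so this loop does not wait on each worker.
            foreach (WorkerClass worker in workers)
            {
                worker.Start();
            }

            done.Wait(TimeSpan.FromSeconds(30));
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine("Workers completed: {0}", RunWorkers(10));
    }
}
```

Note that event handlers subscribed this way execute on the ThreadPool thread that raised the event, not on the thread that called Start(), so any shared state they touch needs the same care as the rest of the threaded code.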

Multiple threads concurrently accessing the same variable

An issue you are bound to run into eventually is having more than one thread accessing a variable at the same time. A good example of this is shown in Figure D.

Figure D

No lock

In this code we have ten threads all accessing the "data" variable, which is of type Dictionary. At first glance this code looks OK. We're doing what we should be doing by checking to make sure the data variable doesn't contain a certain key before adding it. In a normal, synchronous application this code would work without a hitch. However, when we introduce threading this code will break.

The reason is that even though we're making sure the key doesn't exist, other threads are accessing the same variable at roughly the same time. So it is possible for two threads to check that the key doesn't exist and both pass the check; then, after one adds the key, the other tries to add the same key and Dictionary.Add throws an ArgumentException. The Thread.Sleep call between the check and the add pretty much guarantees that we'll get an exception.

To fix this type of issue we need to tell the runtime that the variable should only be accessed by one thread at a time. This is accomplished by using the lock() statement, which is shown in Figure E.

Figure E

Using lock

The only major difference between this code and the code shown in Figure D is the lock() statement. The lock statement acquires a mutual-exclusion lock on the object passed into it, so only one thread at a time can execute code that is guarded by a lock on that same object. This ensures that once we check to make sure the key doesn't exist, no other thread that takes the same lock will be able to add the key before we add it.
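Since the figures are not reproduced in this text, the following sketch shows one way the locked check-then-add pattern could look. It is an assumption-laden illustration, not the original code: the LockSketch class, the AddIfMissing and Run methods, the key name, and the dedicated dataLock object are all hypothetical (the original may well lock on the dictionary itself).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class LockSketch
{
    private static readonly Dictionary<string, int> data = new Dictionary<string, int>();

    // Dedicated private lock object (an assumption of this sketch).
    private static readonly object dataLock = new object();

    // The check and the add happen under one lock, so no other thread
    // can add the key between ContainsKey and Add.
    private static void AddIfMissing(string key, int value)
    {
        lock (dataLock)
        {
            if (!data.ContainsKey(key))
            {
                Thread.Sleep(10); // widens the window that would race without the lock
                data.Add(key, value);
            }
        }
    }

    // Queues 'threadCount' work items that all try to add the same key,
    // waits for them, and returns the number of keys in the dictionary.
    public static int Run(int threadCount)
    {
        lock (dataLock)
        {
            data.Clear();
        }

        using (CountdownEvent done = new CountdownEvent(threadCount))
        {
            for (int i = 0; i < threadCount; i++)
            {
                ThreadPool.QueueUserWorkItem(delegate(object state)
                {
                    try
                    {
                        AddIfMissing("key", (int)state);
                    }
                    finally
                    {
                        done.Signal(); // always signal, even if the add fails
                    }
                }, i);
            }
            done.Wait();
        }

        lock (dataLock)
        {
            return data.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Keys in dictionary: {0}", Run(10));
    }
}
```

Locking on a private object rather than on the dictionary is a design choice in this sketch: it keeps code outside the class from taking the same lock by accident. The sleep inside the lock is there only to mirror the demonstration; in real code the locked region should be as short as possible.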

It is extremely important to use this type of protection when threading is in use. The reason is that these errors often won't show up during development, where there is rarely enough concurrent load to trigger them, and could make it all the way to a production environment before being caught.

ThreadPool limits

The drawback of the ThreadPool is that the number of threads is finite. In early versions of the .NET Framework the default is 25 worker threads per available processor (later versions raise this default considerably). With that default on a single-processor machine, if you queue up 30 tasks, the last five will have to wait for threads to become available from the pool before being executed. To get past the thread count restriction, Microsoft has provided developers with a way to override this number: call the ThreadPool.SetMaxThreads method and pass the maximum number of worker threads and of asynchronous I/O (completion port) threads you would like to have available.

A similar method is SetMinThreads. The ThreadPool doesn't keep all of its threads sitting idle and waiting for tasks -- it only creates threads as they are needed, up to the value given to SetMaxThreads. So, if you expect to need, say, 30 threads, you'll want to use SetMinThreads to make the minimum number of threads 30.

Raising the minimum will increase performance because, once the minimum has been reached, the ThreadPool doesn't create new threads immediately when they are needed; it only does so at certain intervals. By default this interval is half a second, so the creation of a thread (and the resulting delay in your application's processing) can take up to half a second even if you haven't reached the maximum thread threshold.
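As a rough illustration of the two calls, the sketch below raises the minimum worker thread count to a requested value and raises the maximum first if it would otherwise be too low. The PoolLimitsSketch class and the RaiseMinimumWorkers method are hypothetical names; the ThreadPool calls themselves each take a worker thread count and a completion port thread count.

```csharp
using System;
using System.Threading;

public static class PoolLimitsSketch
{
    // Tries to make at least 'desiredWorkers' worker threads available without the
    // thread-creation delay. Returns true if the pool accepted the new settings.
    public static bool RaiseMinimumWorkers(int desiredWorkers)
    {
        int minWorker, minIo, maxWorker, maxIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

        // The minimum cannot exceed the maximum, so raise the maximum first if needed.
        if (desiredWorkers > maxWorker)
        {
            if (!ThreadPool.SetMaxThreads(desiredWorkers, maxIo))
            {
                return false;
            }
        }

        // Never lower an existing minimum; leave the I/O thread setting unchanged.
        return ThreadPool.SetMinThreads(Math.Max(minWorker, desiredWorkers), minIo);
    }

    public static void Main()
    {
        bool ok = RaiseMinimumWorkers(30);

        int worker, io;
        ThreadPool.GetMinThreads(out worker, out io);
        Console.WriteLine("Accepted: {0}, minimum worker threads: {1}", ok, worker);
    }
}
```

Both Set methods return false instead of throwing when the pool rejects the values, so checking the return value is worthwhile. These settings apply to the whole process, which is why the sketch only ever raises them.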

In Part 2

In Part 2 of this series I will show you a couple of advanced methods to keep track of your threads and show how to handle thread exceptions. Stay tuned!

9 comments
ozrich2

Hi, great article! I just have a little question: in Figure B there is a list object called "Workers". It isn't defined anywhere, I haven't managed to define it myself, and I can't make the script work. Can someone please explain to me how to define the List "Workers" correctly? Thank you in advance.

verelse

As Tony said, might be good for printing data once you have a safe copy in the thread and that is what I am doing with this. I've not had any need for threading so far, did the usual "test" apps, but now I have multiple, large data set exports "printing" to PDF, Excel and other formats on a web server. I am threading these out so they do not block the web page whose sole purpose is to receive the job, log it, then fire a COM+ event which kicks off the correct print job. None of that belongs in ASP.Net, hence the desire to move it out. Thanks for the article.

Tony Hopkinson

A couple of points to bear in mind when adapting an example like this. Don't do this: Lock(Something) { Recreate the entire universe() }. Keep the length of time you hold a lock as short as possible, and lock at the lowest 'level' for the operation.

For instance, if you were changing the value of a key more than you were adding it, it might be better to have one thread for managing the keys in the dictionary and a pool for dealing with the values. Seeing as adding the key is effectively sequential, threading that part of it simply leaves you with a big pile of blocked threads. But locking the dictionary in one and the value in the other could be a better solution. Don't forget the other side of this equation as well: what about deleting keys from the dictionary?

I really wanted to make the point about lock though. Far too often I see code with locks all over it, and some numpty with a big smile, patting themselves on the back and adding multi-threading to their resume.

The other thing to bear in mind is that there is no reason for threading if the application can't usefully do something else while the thread is parked. Setting printing off, for instance, is a reasonable idea for a thread, given that you've passed a copy of the data to be printed to the thread so you aren't changing the data while it's being printed! A sequential operation (Do This, Do That), where Do That relies on Do This being completed, is not two threads; it might be one. Don't do threading for a laugh, it's too much work all round to be that funny.

BALTHOR

If you have ever seen a large print out of a computer program you've seen script. A script interpreter program is used to print out the program. This is not gigantic intellect, this is a script interpreter program. Programming or software writing started with ROM flashing. Whoever invented the ROM chip and ROM flashing invented a way to write the program.

zs_box

Glad you liked the article! Hopefully Part 2 will be useful for you as well. -Zach

Justin James

All spot on. Especially: "Don't do threading for a laugh, it's too much work all round to be that funny." This article should also not be taken as the only approach in .NET. For a single task that supports things like a clean cancellation and status updates, the BackgroundWorker class is a great choice. And for me, I do not use the ThreadPool at all. I require much more fine-grained control over my threading, and I find that for what I do (typically spiking the CPU to 100% for a tough computation), the ThreadPool will kill me, since it will gladly hand me a few dozen threads, which will then all be in contention. I really need to write my own thread management object that automatically throttles the threads based upon resource consumption... J.Ja

zs_box

You've piqued my interest with the thread management code - especially based on processor usage. I'd love to see that if you don't mind. Also, Part 2 of this article will discuss some options on how to manage ThreadPool threads. But you are correct, there are times when the ThreadPool is NOT the best choice when doing threading in .NET.

zs_box

Sounds good to me. Thread management is always a touchy thing to do, so any components to help with that would be great. The resource based throttling is particularly interesting. See ya, Zach

Justin James

Zach - I'd be glad to get a copy of it over to you, once I get it done. It is still in the "thinking about it" stage. :) J.Ja