
How to use the Parallel class for simple multithreading

In this programming tutorial about Parallel Extensions in .NET 4, Justin James focuses on imperative parallelism as presented by the Parallel class.

Multithreaded application development is a topic I have written about on TechRepublic a number of times. I published a seven-part series on how to write multithreaded code in VB.NET, and I discussed the Parallel Extensions (which shipped with .NET 4), but never with any code samples for whatever reason.

At the request of a reader who asked for some more up-to-date multithreading code, in this column I will highlight some features of Parallel Extensions in .NET 4 with hands-on code. (The code is from my presentation on Parallel Extensions that I've given in the Carolinas the last two years.) There is a lot to cover, so in this column, I will focus on what is called imperative parallelism as presented by the Parallel class (which is part of the System.Threading.Tasks namespace).

The Parallel class

The static Parallel class contains three very useful methods: For, ForEach, and Invoke. For and ForEach operate upon an Action object at their heart; Invoke works on an array of Actions. For and ForEach mimic the functionality of the loops for which they are named.

  • Parallel.For accepts a start boundary, an end boundary, and an Action<int> as arguments. The Action will be called once for each number from the start number up to, but not including, the end number, and each of those numbers will be passed into the Action as its argument.
  • Parallel.ForEach works with an IEnumerable<T> and an Action<T> (where T is the same for both), and calls the Action once for each item in the IEnumerable, passing in that item to the Action.
  • Parallel.Invoke is a little less complex; it simply calls each Action in the array once.
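To make the three call shapes concrete, here is a minimal sketch of each method. The class and method names (ParallelOverview, SquareRange, TotalLength, RunBoth) are hypothetical and are not part of the article's sample code; only the Parallel.For, Parallel.ForEach, and Parallel.Invoke calls come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class ParallelOverview
{
    // Parallel.For: the first bound is inclusive, the second is exclusive.
    // Each result is stored by index, so the output does not depend on
    // the order in which the iterations happen to run.
    public static int[] SquareRange(int start, int end)
    {
        var results = new int[end - start];
        Parallel.For(start, end, i => { results[i - start] = i * i; });
        return results;
    }

    // Parallel.ForEach: the Action is called once per item.
    // The shared total is updated inside a lock to keep it consistent.
    public static int TotalLength(IEnumerable<string> words)
    {
        object gate = new object();
        int total = 0;
        Parallel.ForEach(words, word =>
        {
            lock (gate) { total += word.Length; }
        });
        return total;
    }

    // Parallel.Invoke: each Action is called once; each writes to its own slot.
    public static string[] RunBoth()
    {
        var slots = new string[2];
        Parallel.Invoke(
            () => { slots[0] = "first"; },
            () => { slots[1] = "second"; });
        return slots;
    }
}
```

Calling ParallelOverview.SquareRange(2, 5) processes 2, 3, and 4 but not 5, which is worth remembering when converting a loop that uses <= in its condition.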

For all three of these methods, the order of execution is not guaranteed. It may be completely random, or in order, or partially in order. If your code requires that it be run in a particular order, it is not a good candidate for parallel operation. Let's look at Parallel.For first.

A normal for loop usually looks something like this:

static void SequentialGeneration(int startNumber, int endNumber)
{
    int iteration = -1;
    DateTime startTime = DateTime.Now;

    for (int counter = startNumber; counter <= endNumber; counter++)
    {
        iteration++;
        System.Console.WriteLine("Iteration " + iteration + ": BEGIN");
        long fibValue = Fibonacci(counter);
        System.Console.WriteLine("Iteration " + iteration + ": " + fibValue);
        System.Console.WriteLine("Time since execution began: " + new TimeSpan(DateTime.Now.Ticks - startTime.Ticks).TotalSeconds);
    }
}

This loop is going to spit out the Fibonacci number for each item in the range of startNumber to endNumber. Typical enough, right? Well, if we wanted to do this with the traditional threading model, we would need to do a lot of work to make it happen relative to the contents of the loop. We would need to create threads, start them with a delegate to a function, and maybe add in code using a Semaphore to keep our active thread count limited to the number of logical CPU cores in the system.
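For a sense of what that manual approach involves, here is a rough sketch using Thread and Semaphore. It is not the 79-line version mentioned below, which is not shown in this column; the class name ManualThreadingSketch, the Generate method, and the Fibonacci helper are all assumptions made for illustration (the Fibonacci function used in the article's samples is never listed).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch of the traditional threading model; names are illustrative.
public static class ManualThreadingSketch
{
    // Assumed stand-in for the article's Fibonacci helper, which is not shown.
    public static long Fibonacci(int n)
    {
        long previous = 0, current = 1;
        for (int i = 0; i < n; i++)
        {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return previous;
    }

    public static long[] Generate(int startNumber, int endNumber)
    {
        var results = new long[endNumber - startNumber + 1];
        var threads = new List<Thread>();
        int cores = Environment.ProcessorCount;

        // The semaphore caps the number of active threads at the logical core count.
        using (var gate = new Semaphore(cores, cores))
        {
            for (int counter = startNumber; counter <= endNumber; counter++)
            {
                int n = counter;  // Copy: each thread needs its own value
                gate.WaitOne();   // Block until a slot is free
                var worker = new Thread(() =>
                {
                    try { results[n - startNumber] = Fibonacci(n); }
                    finally { gate.Release(); }  // Free the slot even on failure
                });
                threads.Add(worker);
                worker.Start();
            }

            // Wait for every worker before the semaphore is disposed.
            foreach (var thread in threads) thread.Join();
        }
        return results;
    }
}
```

Even in this trimmed form, most of the code is bookkeeping (slots, copies of the loop variable, joins) rather than the work itself.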

In another example I use that is nearly identical, the code around this loop goes from 18 LOC (including whitespace and braces) to a whopping 79 LOC, which involves two functions and a class with five properties, a function, and a constructor. That's a massive amount of code bloat! Even worse, because of the use of a delegate, the code is completely abstract and indirect; it is very difficult to trace a piece of code back to its caller. Every person I have talked to who has worked within this model really does not like it. With Parallel.For, we are only going to go to 24 LOC with no substantial increase in complexity. Here's what the code looks like:

static void ParallelGeneration(int startNumber, int endNumber)
{
    object lockObject = new object(); // Needed for atomic operations on iteration
    int iteration = -1;
    DateTime startTime = DateTime.Now;

    Action<int> forLoop = counter => // Begin definition of forLoop
    {
        int currentIteration;
        lock (lockObject) // Lock iteration
        {
            iteration++;
            currentIteration = iteration; // Copy inside the lock; other threads may change iteration afterward
        }
        System.Console.WriteLine("Iteration " + currentIteration + ": BEGIN");
        long fibValue = Fibonacci(counter);
        System.Console.WriteLine("Iteration " + currentIteration + ": " + fibValue);
        System.Console.WriteLine("Time since execution began: " + new TimeSpan(DateTime.Now.Ticks - startTime.Ticks).TotalSeconds);
    }; // End definition of forLoop

    // The end bound of Parallel.For is exclusive, so pass endNumber + 1 to include endNumber
    Parallel.For(startNumber, endNumber + 1, forLoop);
}

You might not be familiar with how Actions work or the locking that is happening, so we'll walk through the changes.

  • Introduction of the lockObject variable: This is needed because we want to make sure that only one thread is updating the value of iteration at a time to prevent what is called a race condition. A race condition may surface only rarely and unpredictably, which makes it hard to reproduce, but when it does, it can corrupt data.
  • The declaration for forLoop: Where we originally had the for statement, we now declare an Action<int>. Notice the use of the lambda syntax; the lambda is simply a compact way of writing the Action. All we did was use the contents of the for loop as the body of the Action, and end that block with a semicolon since that is legally the end of the declaration statement.
  • The lock block: We wrapped the code that updates the iteration value in the lock block (SyncLock in VB.NET). The lockObject is the "key" to the lock. Two lock blocks using different keys can execute simultaneously, but if multiple blocks using the same key try executing at the same time, they go one at a time. This prevents the potential race condition around the iteration++ statement. A good rule of thumb is to use lock every time you want to update a variable that more than one thread can reach. If you need more granular control over the release of the lock, use Monitor.Enter() and Monitor.Exit() instead.
  • The calling of Parallel.For(): Instead of having a for statement, we pass the start number, end number, and the Action<int> to Parallel.For(). Keep in mind that Parallel.For() treats the end number as exclusive, so it must be one higher than the last value you want processed.

As you can see, it required very little effort to convert our existing for loop to use Parallel.For(). If you are using .NET 4, you can do this right now and see nearly effortless performance gains with existing and new applications.
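The locking options described in the walkthrough can be compared side by side. The sketch below is illustrative only: the class and method names are hypothetical, and the third variant uses Interlocked.Increment, which this column does not cover but which is a common alternative when the only shared operation is bumping an integer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical comparison of three ways to protect a shared counter.
public static class CounterSketch
{
    // The lock statement, as used in the article's example.
    public static int CountWithLock(int iterations)
    {
        object lockObject = new object();
        int count = 0;
        Parallel.For(0, iterations, i =>
        {
            lock (lockObject) { count++; }
        });
        return count;
    }

    // Monitor.Enter/Exit: the explicit form; try/finally guarantees the release.
    public static int CountWithMonitor(int iterations)
    {
        object lockObject = new object();
        int count = 0;
        Parallel.For(0, iterations, i =>
        {
            Monitor.Enter(lockObject);
            try { count++; }
            finally { Monitor.Exit(lockObject); }
        });
        return count;
    }

    // Interlocked.Increment: an atomic increment with no lock object
    // (an alternative technique, not taken from the article).
    public static int CountWithInterlocked(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref count); });
        return count;
    }
}
```

All three variants should return the same total for a given iteration count. Removing the protection from any of them is an easy way to watch a race condition produce a total that comes up short on some runs.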

Parallel.ForEach is even easier. Here is the original code, which creates a list of numbers in order and then prints them to the screen:

static void SequentialGeneration(int startNumber, int endNumber)
{
    var numbers = new List<int>();

    for (var counter = startNumber; counter <= endNumber; counter++)
    {
        numbers.Add(counter);
    }

    foreach (var number in numbers)
    {
        Console.WriteLine(number);
    }
}

And here is a version that runs in parallel:

static void ParallelGeneration(int startNumber, int endNumber)
{
    var numbers = new List<int>();

    for (var counter = startNumber; counter <= endNumber; counter++)
    {
        numbers.Add(counter);
    }

    Action<int> forEachLoop = number => // Begin definition of forEachLoop
    {
        Console.WriteLine(number);
    }; // End definition of forEachLoop

    // The numbers may print in any order
    Parallel.ForEach(numbers, forEachLoop);
}

This is about as easy as it gets, folks!
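One thing the printing example does not show is what to do when the loop body needs to produce results. The following is a minimal sketch under the assumption that the results go into a ConcurrentBag<int> (from System.Collections.Concurrent, also new in .NET 4) rather than a plain List<int>, which is not safe to add to from several threads at once. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical example: collecting results from Parallel.ForEach.
public static class ForEachSketch
{
    public static List<int> DoubleAll(IEnumerable<int> numbers)
    {
        // ConcurrentBag allows Add from multiple threads without an explicit lock.
        var doubled = new ConcurrentBag<int>();

        Parallel.ForEach(numbers, number => doubled.Add(number * 2));

        // The bag has no meaningful order, so sort before returning.
        return doubled.OrderBy(value => value).ToList();
    }
}
```

The sort at the end is only there because the order in which items were processed is not guaranteed; if order does not matter to the caller, it can be dropped.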

Parallel.Invoke() is a slightly more confusing idea, but just as easy in practice. Parallel.For and Parallel.ForEach are used when you have the exact same code that needs to be called multiple times with different input parameters. Parallel.Invoke is more useful for calling different pieces of code at the same time. For example, if you are reading data from a network location, calling a Web Service, and drawing something on the screen, this is a good way to do them all at once and save time. All you do is create an array of Action objects and pass the array to Parallel.Invoke():

static void InvokeExample()
{
    // First action: print a random-length run of numbers
    Action action1 = () =>
    {
        var rng = new Random();
        var endNumber = rng.Next(500);

        for (var counter = 0; counter <= endNumber; counter++)
        {
            Console.WriteLine(counter);
        }
    };

    // Second action: download a page (WebClient is in the System.Net namespace)
    Action action2 = () =>
    {
        var client = new WebClient();
        var data = client.DownloadString("http://www.techrepublic.com");
        // Store data somewhere
    };

    Action[] actions = { action1, action2 };

    Parallel.Invoke(actions);
}

This will download from the URL, while simultaneously printing to the screen. It is a trivial example, but I think it is clear how this can be very useful in daily programming. Again, with only a light amount of effort to wrap the code into Action objects, we are getting parallel operation. Not bad for a few minutes' worth of work!
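Because the actions run independently, a failure in one of them (a network error in the download, for instance) surfaces from Parallel.Invoke as an AggregateException that wraps the underlying exceptions. Here is a minimal sketch of handling that; the InvokeSketch class and RunAll method are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper showing how failures surface from Parallel.Invoke.
public static class InvokeSketch
{
    // Runs the given actions and returns how many exceptions were reported.
    public static int RunAll(params Action[] actions)
    {
        try
        {
            Parallel.Invoke(actions);
            return 0;
        }
        catch (AggregateException ex)
        {
            // InnerExceptions holds the exceptions thrown by the individual actions.
            return ex.InnerExceptions.Count;
        }
    }
}
```

Since Parallel.Invoke accepts a params array, the actions can also be passed inline as lambdas instead of being collected into an array first, for example InvokeSketch.RunAll(() => Console.WriteLine("a"), () => Console.WriteLine("b")).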

J.Ja

Disclosure of Justin's industry affiliations: Justin James has a contract with Spiceworks to write product buying guides; he has a contract with OpenAmplify, which is owned by Hapax, to write a series of blogs, tutorials, and articles; and he has a contract with OutSystems to write articles, sample code, etc.

About

Justin James is the Lead Architect for Conigent.
