
Microsoft's Parallel Extensions Library: Making multithreading easier

Microsoft's Parallel Extensions Library is a library of objects designed to speed the development of multithreaded applications. Justin James examines how the Parallel Extensions Library CTP compares with more traditional .NET multithreading.

Microsoft is working hard to release the Parallel Extensions Library, a library of objects designed to speed the development of multithreaded applications. The library offers two major advantages over traditional .NET multithreading techniques.

First, the Parallel Extensions Library encapsulates many existing concepts into prepackaged and tested components, reducing the likelihood of an individual developer making a mistake. Second, developers can now use multithreading strategies that were previously considered "more work than they are worth" even though they deliver real benefits.

Here's a quick look at how the Parallel Extensions Library (as it stands in the December 2007 Community Technology Preview) compares with more traditional .NET multithreading.

Three groups of parallelism

The Parallel Extensions are divided into three main groups: declarative data parallelism (Parallel LINQ or PLINQ), imperative data parallelism (parallel versions of common programming patterns for working with data), and imperative task parallelism (systems for creating and running tasks).

PLINQ

PLINQ is designed to mimic the existing Language Integrated Query (LINQ) system, but it runs queries in parallel. In many cases, taking advantage of PLINQ requires only minimal changes to existing code. Since LINQ is still relatively new and has not penetrated far into existing code, it should be pretty easy for programmers to adopt PLINQ instead of LINQ where appropriate. The documentation makes me think PLINQ may not be ready for prime time yet, but it is a Community Technology Preview, after all.
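To illustrate the "minimal changes" idea, here is a rough sketch in Python rather than .NET. It is not PLINQ and does not use any Parallel Extensions API; it only shows the same pattern with Python's standard concurrent.futures module, where a sequential filter-and-project query becomes a parallel one by handing the projection to a thread pool. The square function and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    """Hypothetical per-item projection, standing in for a query's select clause."""
    return n * n

numbers = range(10)

# Sequential form: filter the even numbers, then project each one.
sequential = [square(n) for n in numbers if n % 2 == 0]

# Parallel form: the same filter and projection, but the projection runs
# on a thread pool. Executor.map returns results in input order.
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(square, (n for n in numbers if n % 2 == 0)))

print(sequential == parallel)
```

The query logic itself is unchanged between the two forms; only the way it is executed differs.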

Imperative data parallelism

The imperative data parallelism items introduce parallel versions of For, ForEach, and Do. For and ForEach execute the iterations in parallel. Do is designed to perform a list of statements asynchronously and in parallel; the Do call finishes when all tasks are completed.

It can be fairly easy to integrate the imperative data parallelism constructs into many projects. Unfortunately, Visual Basic developers will have to do a little bit of extra work due to Visual Basic's lack of anonymous methods; they will often find themselves putting the body of a loop in a separate function in order to use the parallel For and ForEach. The big difference between the traditional loops and their parallel versions is that the new loops are function calls: instead of writing a block of code, you pass in the details of the iteration (an IEnumerable<T> of objects to be iterated over or a start/stop number, plus the action to be performed).
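The "loop as a function call" shape can be sketched as follows. This is a Python analogue built on the standard concurrent.futures module, not the Parallel Extensions API; parallel_for_each and parallel_do are hypothetical helpers written for this illustration, loosely mirroring the ForEach and Do described above. Note that the loop body lives in a separate named function, much as a Visual Basic developer would have to write it.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for_each(items, action):
    """Hypothetical helper: call action(item) for every item on a thread pool."""
    with ThreadPoolExecutor() as pool:
        # Draining the iterator waits for every call and re-raises any exception.
        list(pool.map(action, items))

def parallel_do(*actions):
    """Hypothetical helper: run zero-argument callables concurrently, then return."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(action) for action in actions]
        for future in futures:
            future.result()  # blocks until this action is done

squares = [None] * 5

def store_square(i):
    # Each iteration writes to its own slot, so no locking is needed here.
    squares[i] = i * i

# The "loop" is a call that receives the range and the action to perform.
parallel_for_each(range(5), store_square)

log = []
# Both statements run concurrently; the call returns once both have finished.
parallel_do(lambda: log.append("first"), lambda: log.append("second"))
```

The order in which the two parallel_do actions run is not guaranteed, so code that depends on ordering should not use this pattern.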

Imperative task parallelism

The task parallelism items allow you to create "tasks" that execute asynchronously. The basic Task objects start to execute immediately. There is also the Future object, which is a task designed to provide a value at some point after it is created; if the result is requested before the task has completed, the call to the value blocks until it is finished.
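The blocking behavior of a future can be sketched like this. The example uses Python's standard concurrent.futures.Future as a stand-in, on the assumption that it behaves similarly to the Future object described above; it is not the Parallel Extensions API, and slow_lookup is a hypothetical high-latency operation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_lookup(key):
    """Hypothetical high-latency operation, such as a disk or network read."""
    time.sleep(0.1)
    return key.upper()

pool = ThreadPoolExecutor(max_workers=1)

# The work is submitted long before the result is needed.
future = pool.submit(slow_lookup, "abc")

# ... other work could happen here while the lookup is in flight ...

# Asking for the value blocks until the task has completed.
value = future.result()
pool.shutdown()
```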

There is also a TaskCoordinator class for managing multiple Task objects in groups. This is a radical shift from the current methods of performing multithreaded work. Instead of creating threads and attempting to manage them yourself (I tend to use List<Thread> a lot), you let the TaskCoordinator handle them, which makes common operations such as waiting for a group of tasks to complete much easier. Future<T> is great for things such as I/O or other high-latency operations, where you can trigger the operation long before you need the results.
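The contrast between hand-managed threads and a coordinated group of tasks can be sketched as follows. Again this is a Python analogue using the standard threading and concurrent.futures modules, not the TaskCoordinator class itself; the work function is hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait

def work(n):
    """Hypothetical unit of work."""
    return n * 2

# Manual approach: keep a list of threads and join each one by hand.
# Results have to be passed back through shared state.
manual_results = [None] * 4

def run(n):
    manual_results[n] = work(n)

threads = [threading.Thread(target=run, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Coordinated approach: submit tasks as a group and wait on the group.
with ThreadPoolExecutor() as pool:
    tasks = [pool.submit(work, n) for n in range(4)]
    done, not_done = wait(tasks)  # returns once every task has completed

results = [task.result() for task in tasks]
```

In the second form, each task carries its own result, so there is no shared list to index into from worker threads.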

The good, the bad, and the disappointing

The Parallel Extensions Library holds a ton of promise. One thing that I find very exciting is that it automatically scales the number of running threads as appropriate. However, as with any parallel system, there is some overhead and a small performance hit, even if the library ends up creating only one running thread. Still, it is nice to have many of the more common parallel situations handled by this library.

The system is still a Community Technology Preview, and it shows in a few places; for instance, the documentation is rather sparse. The Parallel Extensions Library also does not do anything about the problems of data concurrency; those are still up to the developer to handle. I hope the Tasks will alleviate much of this by letting developers write to shared memory in the main thread, as it handles the completion of tasks, rather than within the tasks themselves. I am also disappointed on one point: I was led to believe that there would be a fork/join construct in the Parallel Extensions Library, and I do not see one yet.
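The idea of confining shared-memory writes to the main thread can be sketched as below. This is a Python illustration with the standard concurrent.futures module, not Parallel Extensions code: the tasks only compute and return values, and the main thread alone updates the shared dictionary as each task completes. The count_words function and the sample documents are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def count_words(text):
    """Hypothetical task: it reads its input and returns a value, nothing more."""
    return len(text.split())

documents = {"a": "one two", "b": "three four five", "c": ""}

totals = {}  # shared state, written only by the main thread

with ThreadPoolExecutor() as pool:
    # Map each submitted task back to the document it belongs to.
    pending = {pool.submit(count_words, text): name
               for name, text in documents.items()}
    # The main thread handles completions one at a time, so no lock is needed.
    for future in as_completed(pending):
        totals[pending[future]] = future.result()
```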

Overall, I am looking forward to working with the final release of the Parallel Extensions Library.

J.Ja

Disclosure of Justin's industry affiliations: Justin James has a working arrangement with Microsoft to write an article for MSDN Magazine.

About

Justin James is the Lead Architect for Conigent.
