
Microsoft's Parallel Extensions Library: Making multithreading easier

Microsoft's Parallel Extensions Library is a library of objects designed to speed the development of multithreaded applications. Justin James examines how the Parallel Extensions Library CTP compares with more traditional .NET multithreading.

Microsoft is working hard to release the Parallel Extensions Library, a library of objects designed to speed the development of multithreaded applications. The library offers two major advantages over traditional .NET multithreading techniques.

The first advantage is that the Parallel Extensions Library encapsulates many existing concepts into prepackaged, tested components, reducing the likelihood of an individual developer making a mistake. The second is that developers can now use multithreading strategies that were previously dismissed as "more work than they are worth" despite the benefits they deliver.

Here's a quick look at how the Parallel Extensions Library (as it stands in the December 2007 Community Technology Preview) compares with more traditional .NET multithreading.

Three groups of parallelism

The Parallel Extensions are divided into three main groups: declarative data parallelism (Parallel LINQ or PLINQ), imperative data parallelism (parallel versions of common programming patterns for working with data), and imperative task parallelism (systems for creating and running tasks).

PLINQ

PLINQ is designed to mimic the existing Language Integrated Query (LINQ) system, but it runs in parallel. In many cases, taking advantage of PLINQ requires only minimal changes to existing code. Since LINQ is still relatively new and has not penetrated far into existing code, it should be fairly easy for programmers to adopt PLINQ instead of LINQ where appropriate. The documentation makes me think this may not be ready for prime time yet, but it is a Community Technology Preview, after all.
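As a rough sketch of how small the change can be (assuming the CTP's AsParallel() extension method, whose exact shape may differ in the final release), a LINQ query goes parallel like this:

```csharp
using System;
using System.Linq;

class PlinqSketch
{
    static void Main()
    {
        int[] numbers = { 3, 1, 4, 1, 5, 9, 2, 6 };

        // Ordinary LINQ: runs sequentially.
        int sequentialSum = numbers.Where(n => n % 2 == 0).Sum();

        // PLINQ: AsParallel() is often the only change needed;
        // the query operators then run across multiple threads.
        int parallelSum = numbers.AsParallel().Where(n => n % 2 == 0).Sum();

        Console.WriteLine(sequentialSum == parallelSum); // True
    }
}
```

Because a parallel query may process elements in any order, order-sensitive operators need more care than this toy example suggests.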

Imperative data parallelism

The imperative data parallelism items introduce parallel versions of For, ForEach, and Do. For and ForEach execute the iterations in parallel. Do is designed to perform a list of statements asynchronously and in parallel; the Do call finishes when all tasks are completed.

It can be fairly easy to integrate the imperative data-parallel constructs into many projects. Unfortunately, Visual Basic developers will have to do a little extra work due to Visual Basic's lack of anonymous methods; they will often find themselves moving the body of a loop into a separate function in order to use the parallel For and ForEach. The big difference from traditional loops is that the parallel versions are method calls: instead of writing a block of code, you pass in the details of the iteration (an IEnumerable<T> of objects to iterate over, or a start/stop number) along with the action to be performed.
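For illustration, here is the shape of the parallel loop in C# (in the December 2007 CTP the Parallel class lives in the System.Threading namespace; treat the exact signatures as approximate, since they changed in later releases):

```csharp
using System;
using System.Threading; // home of the Parallel class in the CTP

class ParallelLoopSketch
{
    static void Main()
    {
        double[] results = new double[100];

        // Sequential version:
        // for (int i = 0; i < results.Length; i++) { results[i] = Math.Sqrt(i); }

        // Parallel version: the loop becomes a method call taking the
        // start/stop numbers and the action to perform for each index.
        Parallel.For(0, results.Length, delegate(int i)
        {
            // Each iteration writes to its own slot, so no locking is needed.
            results[i] = Math.Sqrt(i);
        });

        Console.WriteLine(results[81]); // 9
    }
}
```

Note that the loop body here is safe to parallelize only because each iteration touches its own array element; shared state would still need the usual synchronization.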

Imperative task parallelism

The task parallelism items allow you to create "tasks" that execute asynchronously. There are the basic Task objects, which begin executing immediately. There is also the Future object, a task designed to provide a value at some point after it is created; if the result is requested before the task has completed, the call to the value blocks until it is finished.

There is also a TaskCoordinator class for managing multiple Task objects in groups. This is a radical shift from the current methods of performing multithreaded work. Instead of creating threads and attempting to manage them by hand (I tend to use List<Thread> a lot), the TaskCoordinator makes it much easier to manage your threads and perform common chores such as waiting for them to complete. Future<T> is great for things such as I/O or other high-latency operations, where you can trigger the work long before you need the results.
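A sketch of the Future pattern described above, using CTP-era names (Future<T> was later renamed Task<TResult>, and the factory method shown here is an assumption based on the preview API, not a verified signature):

```csharp
using System;
using System.Threading.Tasks; // CTP namespace for Task and Future

class FutureSketch
{
    static void Main()
    {
        // Start a high-latency computation long before the result is needed.
        Future<int> answer = Future.Create(delegate
        {
            // Stand-in for I/O or other slow work.
            System.Threading.Thread.Sleep(500);
            return 42;
        });

        // ... do other useful work here while the future runs ...

        // Reading Value blocks only if the computation has not finished yet.
        Console.WriteLine(answer.Value);
    }
}
```

The design payoff is that the decision "when do I need this value?" is separated from "when do I start computing it?", which is exactly the fit for high-latency operations the article describes.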

The good, the bad, and the disappointing

The Parallel Extensions Library shows a ton of promise. One thing I find very exciting is that it automatically scales the number of running threads as appropriate. However, as with any parallel system, there is a bit of performance overhead, even if only one thread ends up running. It is still nice to have many of the more common parallel situations handled by this library.

The system is still a Community Technology Preview, and it shows in a few places; for instance, the documentation is rather sparse. The Parallel Extensions Library also does not do anything about the problems of data concurrency; those are still up to the developer to handle. I hope the Tasks will alleviate much of this by allowing developers to write to shared memory in the main thread that handles the completion of tasks, rather than within the tasks themselves. It also disappoints me that, after being led to believe there would be a fork/join construct in the Parallel Extensions Library, I do not see one yet.

Overall, I am looking forward to working with the final release of the Parallel Extensions Library.

J.Ja

Disclosure of Justin's industry affiliations: Justin James has a working arrangement with Microsoft to write an article for MSDN Magazine.

About

Justin James is the Lead Architect for Conigent.

13 comments
aureolin

Kicking off threads to do stuff for you was always the easy part. Handling data concurrency - that was the hard part. Basically all they did was stuff a competent programmer doesn't need. :-(

BALTHOR

Digital chips can output a sine wave! Whatever I put in that program exe is run in that chip. There is no end to what digital can do given a good operating frequency. Digital is pulses of DC and my CPU is liquid cooled. The higher the frequency, the smaller the bits, the better the resolution. I see digital as replacing the transistor.

BALTHOR

The OS is recorded on the hard drive. When I start the computer the OS loads into the RAM. The BIOS identifies the various devices and makes them work with a mouse click. I could not even guess what the CPU does. If you think of a ROM chip device you will see that the exe starts when the device, like a microwave oven, is switched on. The exe in digital stands alone and needs no other processing assistance. Articles like this one just have to be more explanative in their explanations. (Don't make up words either!)

Justin James

Does the Parallel Extensions library sound like it gives you enough of a leg up to start working with multithreading? Are you going to look at it now, or wait until the final version is released? J.Ja

Justin James

I doubt that they took this idea from you in particular. For one thing, the current release is a December 2007 CTP; I believe there was one before that as well. Also, there is really not too much "new" in here (and it really is not very similar to what we discussed, or to your paper as I recall it); this is more a library of common multithreading patterns that are already written and tested than groundbreaking work. J.Ja

Justin James

Unfortunately, the vast majority of programmers are scared to death of even the easy part of multithreading. And yeah, they may have just covered the easy part, but in some instances here, "easy" is a relative term. Sure, writing a for loop that kicks off threads to process asynchronously is possible, but it is a bit of work to implement a throttling algorithm that allows it to consume resources effectively. What I like about this library is that it removes that kind of work from my end of things. I agree 100% that it would benefit significantly from having data concurrency handled too. J.Ja

viper777

You may want to refine what you meant by that! It's transistors, FETs, and gates (which are basically transistors) that create the square waves and sine waves in a circuit. You could use a crude oscillator, like a relay buzzing wired up as such, or create feedback in a path, which still uses semiconductors in an unstable operation. Also, the transistor is what does the logic switching in a typical integrated circuit (chip). How many transistors does an Intel Pentium processor have? The first Pentium processor had 3.1 million transistors! Did you mean optical fibre linking between sections of a motherboard?

simon.whitear

This is a good example of MS's application blocks, patterns, and practices strategy putting a lot of power in as many hands as possible. But .NET multithreading has been a delight from 1.0 on: the built-in Async Programming Model and Threading namespaces meant any competent developer could be writing multithreaded apps within days of picking up the framework. Your article touched on the difficulties; they lie in understanding the implications of parallel and async patterns on application and component design. MS would do well to package some guidance on debugging methods, data concurrency, synchronisation, deadlocks, race conditions, et al. Without the fundamentals of multithreading, many development teams will find successful implementations impossible.

Vladas Saulis

I can agree that it could be difficult to see the correlations with my work at first sight. They have implemented it as a quick-and-dirty hack of my concept, in the exact and only way possible for the present set of technologies (like .NET). When I looked more closely at their reference work (dated October 2007), I found too many correlations: the task and queue concepts, the throwing off of loop iterations, the talk of transactions, shared task memory, waiting on completeness, and my proposed load-balancing solution. Each of these could be just a coincidence taken separately, but in combination, there is a very big chance they were inspired by my work. BTW, my work was first published in June 2007. Google reacted to my paper in just three days by offering me a sysadmin (!) position, which was not what I wanted. And no reaction from Microsoft, as expected. I understand that I cannot have real proof of the use of my ideas; resemblance of ideas is very hard to prove. On the other hand, I can see many positions (and specifically, combinations of them) about which I could tell this was not just a coincidence.

Justin James

Yeah, they are missing the hard stuff, as another commenter mentioned as well. I think where this library fits best is the little, common pieces. Even the stuff that has to deal with concurrency, deadlocks, etc. can still be done with this library, but it would be a mistake for someone to think that, with this library, they do not need to handle those issues themselves. J.Ja
