One of the recent pieces of technology capturing my interest is Microsoft's Parallel Extensions Library. While I do not think that every application is a good candidate to use concurrency, the Parallel Extensions Library is a good tool for those applications that can make use of it. In a nutshell, the library takes most of the work out of running pieces of code in parallel.
I have used the Parallel Extensions Library for many pieces of demonstration code (it is still in CTP, so I won't use it in production code yet) with good results. Here's a look at my experiences converting an existing application to make use of the library.
Using the library
About a year ago, I took it upon myself to write a "photomosaic" application in VB.NET. I thought it would be a fun application to write, since I enjoy writing code that manipulates graphics, and because I thought that it would be a good opportunity to get some parallel processing practice. It was a great learning experience. The first compile of it that worked as expected took about two days to run and consumed 100% of one CPU core the entire time, while also reserving about 1 GB of RAM for itself (yes, it held 1 GB of RAM for two days). I then embarked upon a major overhaul of the code to take it from working to optimized. Along the way, I added the anticipated parallel processing.
The bulk of the overhaul was pretty quick; there was a lot of "low hanging fruit" that got plucked. The execution time dropped from days to hours, and I was able to bring the RAM usage down to a much more acceptable number. There was no way of dodging the 1 GB peak, but I was able to confine it to a minute or two at the end of the processing rather than the entire run. Adding in the parallel processing was a royal pain in the neck, though. To make it worse, I was using the BackgroundWorker object to run the processing asynchronously from the main application thread in order to allow cancellation and progress updates without making the application feel hung.
I ended up with a Frankenstein's Monster of code. Half of my processing routine had to do with thread management. To be honest, I think that .NET's ThreadPool is fine for lightweight work, but I do not trust it for heavy processing. So I put together a tangle of code: it maintained a List<Thread> for the running threads, and for each pixel to be processed, it would loop over the list until at least one thread had finished and freed up a slot. To support cancellation, that entire disaster was running in its own thread, while the thread that called it (which was in the BackgroundWorker object) would sleep for half a second, check the BackgroundWorker to see if cancellation was requested, and if it had not been, go back to sleep. One thing I have learned in this industry is that, if a system is hard to explain and even harder to understand, it is probably too complex to work well. Overall, this system worked, but hardly well. It was still a bit pokey, and debugging/maintaining the code was a nightmare. Heck, getting it to work the first time around was a nightmare!
When I sat down to convert it to use the Parallel Extensions Library, I also rewrote the main processing code in C#. There was no overwhelming reason for this, other than the fact that I've used lambda expressions in C# and not in VB.NET, so I felt comfortable with that language for this task. Since the Parallel Extensions Library relies upon passing lambda expressions to various methods, this was a logical jump for me. At the same time, I was also taking a Visual Studio 2005 application to Visual Studio 2008. Somewhere along the way on the first try, I had managed to introduce a bug that I just could not conquer. I reconverted the project to Visual Studio 2008 and reorganized it a bit (introducing C# made me split the project into three projects, to avoid circular dependency loops), and the problem went away.
At that point, I did the actual conversion from my home-brewed threading model to the Parallel Extensions Library. The bulk of the effort was simply cutting out chunks of code. I cut the line count for the processing routine from 139 SLOC of VB.NET to 118 SLOC of C# (stripping out blank lines and block delimiters in each case), even though I added a significant amount of error handling along the way. The number of variables needed decreased dramatically as well. The speed of execution went up significantly, in large part because the Parallel Extensions Library kept my CPU saturated, while my home-brewed system was running one thread per CPU core at best and would often leave a core idle. Not only did I see big performance gains, but the code itself is much better now. It is much more readable, much clearer, and does not require the maintainer to emulate a multithreaded system in their head in order to "get" what the code is trying to do. Best of all, I stopped relying upon the bizarre "leaving notes for each other in a common spot" method of thread management that I had cooked up.
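To give a feel for how little thread management is left after such a conversion, here is a minimal sketch of a cancellable per-row loop. It is not the photomosaic code itself: the MosaicSketch class, the ProcessPixel stand-in, and the image dimensions are all hypothetical, and it assumes the API as it later shipped in .NET 4 (the System.Threading.Tasks namespace), which may differ from the CTP builds.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class MosaicSketch
{
    // Hypothetical stand-in for the real per-pixel work
    // (for example, choosing the best-matching tile image).
    public static int ProcessPixel(int x, int y)
    {
        return (x * 31 + y * 17) % 256;
    }

    // Processes every pixel of a width x height image, one row per loop iteration.
    // Returns null if cancellation was requested before the loop finished.
    public static int[,] ProcessImage(int width, int height, CancellationToken token)
    {
        var result = new int[height, width];
        var options = new ParallelOptions { CancellationToken = token };

        try
        {
            // The library decides how many rows run at once; no thread list,
            // no polling, no sleeping supervisor thread.
            Parallel.For(0, height, options, y =>
            {
                for (int x = 0; x < width; x++)
                {
                    // Each iteration writes only to its own row, so no locking is needed.
                    result[y, x] = ProcessPixel(x, y);
                }
            });
        }
        catch (OperationCanceledException)
        {
            return null;
        }

        return result;
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            // A UI would call cts.Cancel() from its "Cancel" button handler.
            int[,] image = ProcessImage(640, 480, cts.Token);
            Console.WriteLine(image != null ? "Finished" : "Cancelled");
        }
    }
}
```

The cancellation token plays the role that the half-second polling loop played in the home-brewed version: the caller requests cancellation, and the loop stops scheduling new rows on its own.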
Overall, I could not be more pleased with the results. When I first wrote this code, I was quite proud of myself because I managed to cobble together a working system that could somewhat make good use of my multi-core CPU. After the rewrite, my pride is coming from having a good, clean piece of code that performs very well. At the end of the day, I would rather have pride in a good piece of fast code than a demented monster that works despite the odds.
The Parallel Extensions Library can easily be applied to the most mundane loop (with the Parallel.For and Parallel.ForEach methods) or group of statements (especially the Parallel.Invoke method). There is little reason not to use it (once it gets a final release, of course) in a great many projects where, previously, adding parallel processing would have been more work than it was worth.
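As a rough illustration of those three methods, here is a small sketch. The ParallelBasics class and its helper methods are made up for this example, and it again assumes the System.Threading.Tasks namespace from the released .NET 4 rather than the CTP.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelBasics
{
    // Parallel.For: a counted loop whose iterations may run concurrently.
    public static long SumOfSquares(int count)
    {
        long total = 0;
        Parallel.For(0, count, i =>
        {
            // Iterations share 'total', so the update must be atomic.
            Interlocked.Add(ref total, (long)i * i);
        });
        return total;
    }

    // Parallel.ForEach: the same idea over any IEnumerable<T>.
    public static ConcurrentDictionary<string, int> WordLengths(IEnumerable<string> words)
    {
        var lengths = new ConcurrentDictionary<string, int>();
        Parallel.ForEach(words, word =>
        {
            lengths[word] = word.Length;
        });
        return lengths;
    }

    // Parallel.Invoke: runs independent statements side by side
    // and returns once all of them have completed.
    public static int[] RunBoth()
    {
        var results = new int[2];
        Parallel.Invoke(
            () => { results[0] = 1 + 1; },
            () => { results[1] = 2 * 3; });
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000));
        Console.WriteLine(WordLengths(new[] { "photo", "mosaic" })["mosaic"]);
        Console.WriteLine(string.Join(", ", RunBoth()));
    }
}
```

The one thing the library does not do for you is protect shared state, which is why the sketch uses Interlocked.Add and ConcurrentDictionary wherever iterations touch the same data.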
J.Ja
Disclosure of Justin's industry affiliations: Justin James has a working arrangement with Microsoft to write an article for MSDN Magazine. He also has a contract with Spiceworks to write product buying guides.
Justin James is the Lead Architect for Conigent.