In a world of commodity processors, it’s hard to justify spending three times the money to get twice the processor. Conversely, time is so valuable these days that any extra power can seem worth buying no matter what the cost. Regardless of how much extra power you want, only two types of users can make use of a second processor: multithreaded application users and multiapplication users.

Applications that can use two processors
The multiapplication user is the person in your office who has a word processor, spreadsheet, Web browser, and chat client open at all times and uses them simultaneously. These people generate a significant load on the system by multitasking with simple applications. A dual processor system is a blessing and a curse to these people because they will have to do a lot more work to justify those frivolous, nonwork applications.

Just as rare as those multitasking individuals are multithreaded applications. However, they are not scarce due to lack of demand. A multithreaded application can perform multiple tasks simultaneously, which cuts its run time dramatically. Writing one is not an easy task. These applications are incredibly tricky to write; a poorly written multithreaded application will perform worse than a single-threaded one if it cools its heels by spinning out threads that sit there waiting on each other. The trick lies in sorting out which functions can be done independently, which ones can’t, and when not to bother.

For example, it’s not worth it to write a multithreaded word processor program, because complex functions are few and far between. The entire program shouldn’t take more than 10 to 20 percent of the processor’s potential, making optimization an exercise in futility. Furthermore, text data is rendered piece by piece as a stream, since it carries no positional data beyond sequence: the word “the” is simply a t, an h, and an e in order. Only page layout programs track the location of each character on a page, and they do it by treating the text as graphic blocks. Without that location information, an application has no choice but to work through the document one character at a time. Thus, there is little opportunity for independent processes.

In contrast, a graphics editor has as many functions as it has filter extensions. And where text files are interpreted linearly, the graphics editor can break up an image and apply filters to various locations, or even to the same location, without the filters waiting on or interfering with each other. These memory- and processor-intensive applications are some of the most common multithreaded programs today because the processors of years past were so slow that splitting the work was worth the trouble.
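
The idea of carving up an image can be sketched in a few lines of Python. This is a minimal illustration, not how any particular graphics editor works: the image is assumed to be a plain list of rows of 8-bit grayscale values, and the names `invert_band`, `split_into_bands`, and `apply_filter_parallel` are made up for this example. Note also that threads in standard CPython take turns on a global interpreter lock, so the sketch shows how the work divides rather than delivering a real two-CPU speedup; a process pool or a filter written in native code would be needed for that.

```python
from concurrent.futures import ThreadPoolExecutor


def invert_band(band):
    """Invert every 8-bit grayscale pixel in one horizontal band of rows."""
    return [[255 - pixel for pixel in row] for row in band]


def split_into_bands(image, parts):
    """Cut the image into contiguous bands of rows (the last may be shorter)."""
    rows_per_band = max(1, -(-len(image) // parts))  # ceiling division
    return [image[i:i + rows_per_band] for i in range(0, len(image), rows_per_band)]


def apply_filter_parallel(image, workers=2):
    """Filter each band on its own worker, then stitch the bands back in order."""
    bands = split_into_bands(image, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        filtered = list(pool.map(invert_band, bands))  # map preserves band order
    return [row for band in filtered for row in band]


# Hypothetical 6-row test image, one worker per imagined processor.
image = [[0, 64, 128, 255] for _ in range(6)]
result = apply_filter_parallel(image, workers=2)
```

Because no band needs anything from another band, neither worker ever waits on the other, which is exactly the property a word processor's character stream lacks.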

There is another multithreaded application you will need: an operating system. Without an OS that supports multiple processors, you get very limited improvement from that second CPU. This restricts you to Windows NT/2000 or a UNIX-like system such as Linux. Windows 95, 98, and Me are all single processor operating systems. If you are using DOS or Windows 3.1, however, I would be curious to know why you are reading this article, as technology has obviously left you behind.
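
A quick way to see how many processors the operating system actually exposes to programs is to ask it. The sketch below uses Python's standard `os.cpu_count()`; the wrapper name `usable_processors` is just a label chosen for this example. An OS without multiprocessor support would be expected to report one processor no matter how many chips sit on the board.

```python
import os


def usable_processors():
    """Return the number of logical processors the OS reports, or 1 if unknown."""
    return os.cpu_count() or 1  # cpu_count() can return None on some platforms


count = usable_processors()
print(f"The operating system reports {count} logical processor(s).")
```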

Performance gains
In theory, a dual processor system will perform twice as fast. The reality, though, is cruel: the general rule of partial upgrades says you only get half the theoretical boost. In other words, all other things held equal, reduce marketing hype by half. In practice, most dual processor systems get about a 50 percent increase over a single CPU in the same board.
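
One way to make sense of that 50 percent figure is Amdahl's law. The rule of thumb above does not name it, so treat it as a model brought in for illustration: if a fraction p of the work can be split across n processors, the speedup is 1 / ((1 - p) + p / n). The function names below are illustrative only.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_fraction_for(speedup, processors):
    """Invert Amdahl's law: the parallel fraction needed to reach `speedup`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / processors)


ideal = amdahl_speedup(1.0, 2)           # everything parallel: 2.0
typical = amdahl_speedup(2.0 / 3.0, 2)   # two-thirds parallel: about 1.5
implied = parallel_fraction_for(1.5, 2)  # about 0.667
```

Under this model, a 1.5x result on two processors corresponds to roughly two-thirds of the workload running in parallel, with the remaining third serialized by memory, disk, or contention between the CPUs.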

There are many reasons for this imperfect gain. The heart of the matter is mismatched components. A fast CPU upgrade is often limited by old memory, a slow drive, or an underpowered video card. Dual processor systems are further limited by the cycles wasted when the two processors fight over the same resource. The solution is to properly match components, with the most important being the processors, motherboard, and memory. In some cases, a single poorly chosen component may negate every other upgrade.

Which hardware is best?
The processor dictates the type of motherboard you use. Oh, there are some different optional components, but it will be some time before you have a choice. Pentium 4s support only Rambus memory, and Athlons use double data rate (DDR) memory. Rambus has the backing of Intel and is designed for large data transfers. It is opposed by DDR, which has the edge in responsiveness and cost effectiveness.

Both are high-speed solutions. Rambus ranges from 1.2 GB/s of bandwidth for single-channel 600-MHz RIMM modules up to a maximum of 3.2 GB/s on dual-channel 800-MHz RIMMs. Single-channel 1600 DDR provides 1.6 GB/s of bandwidth, with dual-channel 2100 DDR reaching 4.2 GB/s at the high end. Only a single dual processor chipset and board exists for each processor at the moment.
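
Those bandwidth figures fall out of simple arithmetic: bytes per transfer, times transfers per second, times the number of channels. The sketch below assumes a 16-bit (2-byte) Rambus channel and a 64-bit (8-byte) DDR module, and takes 1600 and 2100 DDR to run at 200 and 266 million transfers per second; none of those widths or rates are stated in this article, and the function name is made up for the example.

```python
def peak_bandwidth_gb_s(bus_width_bytes, mega_transfers_per_s, channels=1):
    """Peak bandwidth in GB/s: bytes per transfer x million transfers/s x channels."""
    return bus_width_bytes * mega_transfers_per_s * channels / 1000.0


RDRAM_BYTES = 2  # assumed 16-bit Rambus channel
DDR_BYTES = 8    # assumed 64-bit DDR module

rambus_low = peak_bandwidth_gb_s(RDRAM_BYTES, 600)               # 1.2
rambus_high = peak_bandwidth_gb_s(RDRAM_BYTES, 800, channels=2)  # 3.2
ddr_low = peak_bandwidth_gb_s(DDR_BYTES, 200)                    # 1.6
ddr_high = peak_bandwidth_gb_s(DDR_BYTES, 266, channels=2)       # about 4.26
```

The last result comes out slightly above the 4.2 GB/s quoted above because that figure is simply twice the nominal 2.1 GB/s in the module's name.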

Intel processors use the GTL+ bus, which shares a single access point between the processors. This forces the processors to be nearly identical in order to avoid conflicts. It also calls for larger caches as more processors are added, so each one has work to do during the downtime that occurs when the other processors are accessing the memory or other components. This does increase cost, but it is somewhat offset by Intel’s experience building these processors and chipsets. Rambus memory is actually well suited to this system as it is serial memory, good at transferring large amounts of sequential data at the cost of latency between data requests.

AMD uses a point-to-point bus, based on the EV6 bus used by Alpha servers. It provides a more flexible system so that both Athlon processors can access data simultaneously without requiring a special high-cache multiprocessor variant. Naturally, that rate is limited by the available bandwidth of the memory or other components. Unsurprisingly, DDR is appropriate for the AMD servers as it can rapidly transmit data with few penalties when making nonsequential requests. AMD’s first foray into the world of multiprocessors has just been released. While AMD does have a multiprocessor-specific variant, it is an expected upgrade to the Athlon that keeps its cache filled efficiently while the processor is otherwise busy, taking advantage of the extra DDR bandwidth.

Want more details on processors?

Check out these three articles from our vaults for a thorough background.

“Motherboard chipsets—the good, the bad, and the ugly”

“Understanding your motherboard’s bus system”

“Overclocking 101: Speeding up specific processors”

Cost effectiveness
I implied that a dual processor system could be cheaper than a top-shelf, single CPU computer. I admit that I am a little ahead of the market today—or behind, depending on your perspective. A year ago, you could have purchased an inexpensive motherboard and two Celeron processors for less than a single Pentium III and board. The extra 20 to 30 percent boost from overclocking was just gravy.

Today, a single Pentium 4 trounces a dual Pentium III system. Also, while AMD Athlons are far cheaper than Pentium 4s, the only dual-CPU motherboard is a feature-rich server board; it’s so expensive that it eliminates all the savings (for now).

What you should be on the lookout for is a time when a dual processor motherboard and two CPUs of X MHz are cheaper than a single 1.5X-MHz processor and motherboard. It won’t be long. Other less luxurious Athlon boards will arrive this fall that should fill your pocketbook with glee and your computer with processors. Prices on the Pentium 4 are expected to drop from their stratospheric heights as it is adopted as the baseline Intel processor, opening up the possibility of cost-effective dual P4 systems.
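
That break-even test is easy to write down. In the sketch below, every price is a hypothetical dollar figure invented for illustration, not a quote from any vendor, and the function name is likewise made up.

```python
def dual_is_better_buy(dual_board, cpu_at_x, single_board, cpu_at_1_5x):
    """True when a dual board plus two X-MHz CPUs undercuts the single 1.5X-MHz setup."""
    dual_total = dual_board + 2 * cpu_at_x
    single_total = single_board + cpu_at_1_5x
    return dual_total < single_total


# Hypothetical prices: an expensive server-class dual board versus a cheaper one.
today = dual_is_better_buy(dual_board=500, cpu_at_x=150, single_board=150, cpu_at_1_5x=400)
later = dual_is_better_buy(dual_board=200, cpu_at_x=100, single_board=150, cpu_at_1_5x=350)
```

With these made-up numbers the pricey board sinks the dual system (800 versus 550), while the cheaper board tips the balance the other way (400 versus 500).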

If you are one of those people who can take advantage of a dual processor system, you should keep your eyes on the market. Life should be good soon. However, please make sure you are going to use that system. I cry every time I hear of a dual processor system used to play solitaire and chat at the same time. If this is you, then put the red jack on the black queen and go call your mother.