Bus mastering adds a great deal of complexity to a communication bus. When bus-mastering components are introduced, any number of system devices might have an interaction problem. Problems with bus mastering can be related to the motherboard chipsets, the BIOS settings, device drivers, or even other components in the system. This Daily Drill Down will focus on the most common problem sources associated with bus-mastering video cards and some remedies to employ when you run into trouble.

Mastering the bus
Normally, all data bus transactions involve the central processing unit (CPU) and some other component. The CPU places an address on the bus and asserts a few control signals to the rest of the system. All the other devices decode the address to see if they are supposed to answer. If the CPU requested a read cycle, for example, the selected device would put the requested data onto the data bus so that the processor could fetch it. Alternatively, if the CPU requested a write cycle, the selected device would copy the contents of the data bus into the designated location.
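The address-decode step described above can be sketched as a toy model. This is only an illustration, not a description of any real chipset: the Device and Bus classes, the address ranges, and the device names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A hypothetical bus device that answers to one contiguous address range."""
    name: str
    base: int
    size: int
    cells: dict = field(default_factory=dict)

    def selected(self, address: int) -> bool:
        # Address decode: does this address fall inside my range?
        return self.base <= address < self.base + self.size


class Bus:
    """Toy bus: the CPU drives an address, exactly one device responds."""

    def __init__(self, devices):
        self.devices = devices

    def decode(self, address: int) -> Device:
        for dev in self.devices:
            if dev.selected(address):
                return dev
        raise LookupError(f"no device answers address {address:#x}")

    def read(self, address: int) -> int:
        # Read cycle: the selected device puts its data on the data bus.
        dev = self.decode(address)
        return dev.cells.get(address - dev.base, 0)

    def write(self, address: int, value: int) -> None:
        # Write cycle: the selected device copies the data bus contents.
        dev = self.decode(address)
        dev.cells[address - dev.base] = value


# Assumed memory map: RAM at 0x0000-0x7FFF, a disk controller at 0x8000-0x800F.
bus = Bus([Device("ram", 0x0000, 0x8000), Device("disk_ctrl", 0x8000, 0x10)])
bus.write(0x0100, 0xAB)
value = bus.read(0x0100)
```

Only the device whose range contains the address reacts; every other device ignores the cycle, which is the behavior the paragraph above describes.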

In many data transactions, routing the data through the CPU is significantly slower than letting the devices involved move it themselves. For example, a hard disk controller will fetch a large block of data from a disk file and then copy this data to a contiguous block of memory. If the CPU were to get involved in this operation, it would waste a great deal of time just fetching instructions. The CPU would have to sequence through a program, which would consist of commands to read from the disk controller, commands to write the data to a memory location, a pointer increment, and a loop counter decrement. In the days before L1 and L2 cache, these instructions would have to be fetched from main memory, through the very same data bus that would be conducting the data transfer from the disk. Thus, the full bandwidth of the data bus could not be dedicated to the disk transfer. Even with L1 and L2 cache, the CPU is not optimized to perform disk I/O transfers. For this reason, bus mastering is employed.
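To make the overhead concrete, here is a rough cycle count for the copy loop described above. It is a deliberately simplified model: it assumes each of the four instructions costs exactly one bus cycle to fetch, that each data read or write costs one bus cycle, and that a bus-mastering controller writes each word straight into memory. The 256-word block size is an arbitrary figure chosen for illustration.

```python
INSTRUCTIONS_PER_WORD = 4  # read, write, pointer increment, loop counter decrement
DATA_CYCLES_PER_WORD = 2   # one read from the controller, one write to memory


def cpu_copy_bus_cycles(words: int, cached: bool = False) -> int:
    """Bus cycles used when the CPU copies the block itself.

    Without a cache, every instruction fetch competes with the data
    transfer for the same bus. With a cache (idealized here as free
    fetches), only the data cycles remain.
    """
    fetch_cycles = 0 if cached else INSTRUCTIONS_PER_WORD
    return words * (fetch_cycles + DATA_CYCLES_PER_WORD)


def bus_master_cycles(words: int) -> int:
    """Bus cycles used when the controller writes directly to memory."""
    return words


BLOCK_WORDS = 256  # hypothetical block size

uncached = cpu_copy_bus_cycles(BLOCK_WORDS)             # 256 * 6
cached = cpu_copy_bus_cycles(BLOCK_WORDS, cached=True)  # 256 * 2
mastered = bus_master_cycles(BLOCK_WORDS)               # 256 * 1
```

Under these assumptions the uncached loop spends two-thirds of its bus cycles fetching instructions, and even the idealized cached loop still moves every word across the bus twice.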

When a device wants to be the bus master, it makes a request to the CPU. When the CPU grants this request, the CPU will go idle, which allows the requesting device to take complete control of the bus. The new bus master can then talk directly to memory or to other devices while the CPU sits on the sidelines. When the bus master has completed its task, it relinquishes the bus, and the CPU goes back to normal instruction processing. This complete process is known as bus mastering.
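The request, grant, transfer, and release sequence can be sketched as a small state machine. The SharedBus class, its method names, and the "video_card" label are hypothetical; real buses implement this handshake with hardware signals rather than method calls.

```python
class SharedBus:
    """Toy model of the request/grant/release handshake."""

    def __init__(self, memory_size: int):
        self.memory = [0] * memory_size
        self.owner = "cpu"  # the CPU owns the bus by default
        self.log = []

    def request(self, device: str) -> bool:
        """Ask for the bus; granted only while the CPU still owns it."""
        if self.owner != "cpu":
            return False
        self.owner = device
        self.log.append(f"cpu idle, {device} is bus master")
        return True

    def master_write(self, device: str, start: int, data: list) -> None:
        """Write directly to memory; only the current bus master may do so."""
        if self.owner != device:
            raise PermissionError(f"{device} does not own the bus")
        if start < 0 or start + len(data) > len(self.memory):
            raise IndexError("transfer falls outside memory")
        for offset, word in enumerate(data):
            self.memory[start + offset] = word
        self.log.append(f"{device} wrote {len(data)} words")

    def release(self, device: str) -> None:
        """Give the bus back so the CPU can resume instruction processing."""
        if self.owner != device:
            raise PermissionError(f"{device} does not own the bus")
        self.owner = "cpu"
        self.log.append(f"{device} released bus, cpu resumes")


shared = SharedBus(memory_size=16)
granted = shared.request("video_card")
shared.master_write("video_card", 4, [1, 2, 3, 4])
shared.release("video_card")
```

The ownership check in master_write is the important part of the sketch: a device that has not been granted the bus cannot touch memory.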

Bus mastering is typically used only when a data manipulation can be performed faster without the involvement of the CPU. Generally, this technique is used for high-speed storage peripheral I/O, as well as graphics applications. Disk I/O is typically implemented with direct memory access (DMA), which is a bus-mastering implementation. Recently, video graphics cards have become so sophisticated that they require bus-mastering capabilities to perform high-speed data manipulations using the CPU’s main memory. Operations such as ray tracing, 3-D surface determination, and surface texturing all require intense number crunching and lots of memory.

Diagnosing bus-mastering video-card problems
Bus-mastering video cards have all the same problems associated with other video cards, but they can also exhibit additional problems as a direct result of their speed, power requirements, and complexity. The increased speed and frequent memory accesses of a bus-mastering video card create a much greater burden on the power supply, which can cause power dropouts. A drop in supply voltage to any component might result in strange behavior or even permanent damage to that component. Also, BIOS parameters, such as the AGP clocking speed and the AGP aperture size, might have a dramatic impact on behavior.

Nearly every video card works just fine in VGA mode upon startup. If the card fails to display the normal BIOS startup screen, however, it may be due to dirty contacts or improper insertion. Remove the card, clean the contacts, and reinsert it. If this doesn’t fix the problem, a physical hardware problem is likely, which you can verify by replacing the video card. If you have an old ISA video card, insert it instead and see if it displays the boot screen. If it does, then the problem lies with either the original video card or the motherboard slot that it occupied.

A common complaint is that the operating system won’t recognize the video card. Incorrect BIOS settings, faulty video driver software, or improper IRQ settings are often the culprits in these situations. The BIOS settings that affect Plug and Play operation and/or IRQ assignment are likely suspects. Some of the BIOS settings to consider are listed in Table A below, although it’s always advisable to check your video-card vendor’s recommendations.

Table A

More modern graphics cards run at much higher speeds, and they typically require a great deal more power. Overclocking either the CPU or the graphics adapter will cause the graphics card to run hotter and draw more power. This can be the root cause of very mysterious intermittent problems.

For example, if an application occasionally causes a flurry of graphics operations, the card will immediately draw more power from the system. If the power supply isn’t able to source enough power to meet this immediate demand, the voltage will drop in different parts of the system. Such a drop in voltage could cause memory to lose data, and it could result in unpredictable behavior or even damage in other system components. If the supply voltage to a chip drops below the voltage being driven onto its inputs, current can flow through the input protection circuitry into the supply rail, perhaps cooking the input transistors and thus destroying the chip. (By the way, this illustrates why you should always turn off the computer before shutting off the monitor. You never want to drive the inputs of a device when the device is turned off.)

Another problem that seems to plague video-card customers is the “leftover driver” issue. A number of video-card vendors recommend removing all the old video-card device drivers before installing a new card because the leftover drivers can compete for resources and prevent the system from accessing the correct driver.

Remedies for bus-mastering issues
If you experience any strange problems during normal system operation, you should immediately check the AGP clock speed. Normally, the BIOS will refer to this as either AGP Turbo or AGP Speed. Usually, the BIOS offers 1X, 2X, and 4X options. Choose the 1X option if you suspect any speed-related or power-related problems. Also, many vendors recommend that you use at least a 235-watt power supply, and they prefer a 250-watt power supply.

Bus mastering introduces a great deal of complexity into the computer. If two devices both want to become the bus master, an arbitration scheme must decide which has priority. Once a device takes control of the bus, it must remember to let go when it has completed its task. If a device were to hang for any reason while controlling the bus, a mechanism must exist to kick the problematic device off the bus, or the system will crash. The operating system and the device drivers should handle most of these issues. Of course, this assumes that all vendors write driver code equally well, which is often not the case.
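The two mechanisms mentioned above, arbitration between competing requesters and eviction of a hung bus master, can be sketched together. This is one simple policy (fixed priority plus a hold-time limit) chosen for illustration; the Arbiter class, the priority values, and the tick limit are assumptions, and real buses may use different schemes.

```python
class Arbiter:
    """Fixed-priority arbiter with a watchdog on how long the bus may be held."""

    def __init__(self, priorities: dict, max_hold_ticks: int):
        self.priorities = priorities          # lower number = higher priority
        self.max_hold_ticks = max_hold_ticks  # watchdog limit, in bus ticks
        self.owner = None
        self.held_for = 0
        self.pending = set()

    def request(self, device: str) -> None:
        self.pending.add(device)

    def release(self, device: str) -> None:
        # A well-behaved master lets go when its task is complete.
        if self.owner == device:
            self.owner = None
            self.held_for = 0

    def tick(self):
        """Advance one bus tick and return the current owner (or None)."""
        if self.owner is not None:
            self.held_for += 1
            if self.held_for > self.max_hold_ticks:
                # Watchdog: kick a hung device off the bus.
                self.owner = None
                self.held_for = 0
        if self.owner is None and self.pending:
            # Arbitration: the highest-priority requester wins.
            winner = min(self.pending, key=self.priorities.__getitem__)
            self.pending.remove(winner)
            self.owner = winner
            self.held_for = 0
        return self.owner


arbiter = Arbiter({"video": 0, "disk": 1}, max_hold_ticks=3)
arbiter.request("disk")
arbiter.request("video")

# The video card wins arbitration but then "hangs" and never releases.
owners = [arbiter.tick() for _ in range(5)]
```

In this run the video card holds the bus until the watchdog limit is exceeded, at which point the waiting disk controller takes over instead of the system stalling.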

To ensure that you are running the correct software, I recommend following this procedure:

  1. First, make sure that you have the latest BIOS for your motherboard. Check the manufacturer’s Web site for the latest upgrades.
  2. If the motherboard manufacturer has any special device drivers for its AGP chipset, use them. If not, consider checking the chipset maker’s Web site. Some of the most popular chipset makers are VIA, AMD, ALi, SiS, and Intel.
  3. Check to see if the graphics-card vendor has a more recent device driver than the one provided by the operating system.
  4. Make sure no old device drivers still exist. You can eliminate them by removing all video-card-related devices via the Device Manager and then reinstalling the drivers.

To determine whether the video card is causing intermittent problems, you might also try disabling the hardware-acceleration functions. To do so, navigate to Control Panel | Display | Settings | Advanced | Performance. If this causes the problems to disappear, then the graphics card or one of the related software components (mentioned above) is probably at fault.

Vendor recommendations
A number of video-card vendors provide excellent technical support information on their Web sites. If you are having a problem with a specific vendor’s card, you might check out its site to see if the problem is documented and if a solution has been published. Some of the most popular graphics-card vendor sites are:

Each of these sites has technical support information to help solve compatibility problems. Some of the video-card makers provide lists of compatible motherboard makers as well. If you haven’t already bought a graphics card, I would strongly recommend checking out both the motherboard vendor’s site and the graphics-card vendor’s site to help you choose a combination that has already been proven compatible.

The role of graphics cards in networked systems
Some lower-budget networks force workstations to double as servers, but when several users start hammering one machine and its performance drops to a standstill, it can greatly annoy the local user. Ideally, you should avoid this situation whenever possible.

I don’t believe in putting fancy graphics cards on servers. Extra complexity is the last thing you want to introduce into a server environment. Why create the potential for new failure modes? Since you want to discourage users from working directly at the keyboard of a server anyway, I would put the most reliable, no-frills graphics card on my servers.

Workstations, on the other hand, serve a very different purpose. Here, you want the best possible video display so as to minimize eyestrain. The more comfortable the monitor and video display, the more productive the user. Thus, a small additional expenditure might yield an additional 10 to 20 minutes per day of productive work. Over the life cycle of the graphics card, this additional productivity should easily justify the higher purchase cost.
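The payback argument above can be checked with simple arithmetic. Only the 10-to-20-minute range comes from the text; the price premium and the hourly labor cost below are purely hypothetical figures chosen to show the calculation, so substitute your own.

```python
def daily_value(minutes_gained: float, hourly_cost: float) -> float:
    """Value of the extra productive minutes gained in one working day."""
    return minutes_gained * hourly_cost / 60


def payback_days(price_premium: float, minutes_gained: float,
                 hourly_cost: float) -> float:
    """Working days needed for the gained time to cover the extra card cost."""
    return price_premium / daily_value(minutes_gained, hourly_cost)


PRICE_PREMIUM = 150.0  # assumed extra cost of the better card
HOURLY_COST = 30.0     # assumed cost of one hour of the user's time

slow_case = payback_days(PRICE_PREMIUM, 10, HOURLY_COST)  # 10 minutes per day
fast_case = payback_days(PRICE_PREMIUM, 20, HOURLY_COST)  # 20 minutes per day
```

With these assumed figures, the card pays for itself in roughly 15 to 30 working days, which is short compared with the typical service life of a graphics card.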

Get on the bus
Bus-mastering video cards offer dramatic performance benefits over their predecessors, which lack these hardware acceleration features. These cards can take control of the processor bus and perform very fast graphic computations. Out of this extra complexity, however, arise a number of problems that might be difficult to diagnose. Fortunately, many of these problems have been discovered and solved. Additional technical support can be found on the various video-card vendors’ Web sites.