In part one of this series, I discussed the role that knowledge and experience play in troubleshooting. This time around, I’m going to explain the super geek term “component” and how it factors into the troubleshooting process. It may sound silly, but this word separates super geeks from ordinary computer users. When most users look at a computer, they just see a computer. When super geeks look at a computer, however, they see a group of interacting components. Troubleshooting should be a process of examining the entire system, zeroing in on the problem component, and then repairing or replacing it.

Start from the general
The first step in troubleshooting any system is defining the scope of the problem. To do this, you first have to define the components of the system. The systems we work on have layer upon layer of components. Discrete components such as resistors and capacitors in combination with other components create circuit cards. Circuit cards combine with other components to make computers. Computers combine with other components to make networks.

The layered nature of the systems on which we work creates an inherent hierarchical structure. Hierarchical structures are often analogous to trees. If you want to climb a tree, you don’t start by grabbing a leaf. You start by digging your toes into a hefty piece of the trunk and working your way up and out from there.

The same is true of troubleshooting. Don’t start with individual components—start by looking at the operation of the system as a whole. The process of fault isolation should begin with the broadest scope and the most general of components.
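For readers who think in code, the trunk-first idea can be sketched as a walk down a tree of components. This is only an illustration under my own assumptions: the `Component` class, the `is_healthy` check, and the sample network are hypothetical names invented for the example, not part of any real diagnostic tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Component:
    """A node in a hypothetical system hierarchy (network > computer > card)."""
    name: str
    is_healthy: Callable[[], bool]  # illustrative health check supplied by the caller
    children: List["Component"] = field(default_factory=list)


def isolate_fault(component: Component) -> Optional[Component]:
    """Return the deepest failing component, starting from the broadest scope."""
    if component.is_healthy():
        return None  # nothing wrong at this scope, so skip everything beneath it
    for child in component.children:
        culprit = isolate_fault(child)
        if culprit is not None:
            return culprit
    # This component fails but none of its children do: the fault is isolated here.
    return component


# Hypothetical system: a network containing a server and a workstation.
video_card = Component("video card", lambda: False)
disk = Component("disk drive", lambda: True)
workstation = Component("workstation", lambda: False, [disk, video_card])
server = Component("server", lambda: True)
network = Component("network", lambda: False, [server, workstation])

print(isolate_fault(network).name)  # prints "video card"
```

Notice that the healthy server is checked once and then ignored, along with everything inside it. That is the payoff of starting at the trunk: each test at a broad scope rules out a whole branch of components at once.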

Isolate the problem
Troubleshooting is about fault isolation. It’s about determining which component of a system is causing the problem. While replacing a defective component can sometimes be the answer to a problem, support technicians must avoid the common mistake of troubleshooting a system by always just replacing parts. This brute-force approach sometimes fixes the problem, but it is inefficient, and it is not troubleshooting.

Replace the faulty component
When I first started working on computers, my employer (Uncle Sam) had a very inexpensive labor force. Components such as disk drives, on the other hand, were very expensive (and quite large). In that environment, it made sense to troubleshoot systems down to the lowest possible level. We would troubleshoot computers down to the individual resistor, capacitor, or IC chip that was causing the problem.

In most instances, however, it’s simply not cost-effective to troubleshoot computers down to the discrete component level. The depth to which you troubleshoot a problem should be proportional to the cost of the component you are troubleshooting. You wouldn’t spend an entire day trying to figure out what is wrong with a video card unless that card cost more than a day of your time.
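The arithmetic behind that rule of thumb is simple enough to write down. Here is a minimal sketch; the function name, the hourly rate, and the card prices are all hypothetical figures chosen for illustration, not numbers from my own shop.

```python
def worth_troubleshooting(hours_needed: float, hourly_rate: float,
                          replacement_cost: float) -> bool:
    """True when diagnosing further costs less than simply replacing the part."""
    labor_cost = hours_needed * hourly_rate
    return labor_cost < replacement_cost


# Hypothetical figures: an 8-hour day at $75 per hour is $600 of labor.
print(worth_troubleshooting(8, 75.0, 150.0))   # False: replace the $150 card
print(worth_troubleshooting(8, 75.0, 2000.0))  # True: the $2,000 card is worth a day
```

A real decision would also weigh downtime, shipping delays, and whether a spare is on the shelf, but the comparison of labor cost against part cost is the core of it.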

Until next time…
Remember to start the troubleshooting process by looking at the whole system. Then, narrow the investigation until the problem component is isolated. Finally, depending on the cost, repair or replace the offending component.

In part three of this series, I will discuss specific methods used to define the scope of a problem. I will also reveal to you another super geek secret—how the members of the Super Geek Club stack the odds in their favor when troubleshooting problems.

Mike Sullivan is a senior systems engineer with Merge Computer Group, Inc., a Richmond, VA, consulting firm. His credentials include MCSE+I, MCDBA, MCT, and 19 years of IT experience. He is a full-time consultant who occasionally takes time off from his clients to teach.
