Troubleshoot memory chip problems

Seldom do memory chips actually fail, but memory-related problems are all too common. Here are some tips to help you solve memory-related errors quickly and easily.

Physical memory chips rarely go bad, but memory-related errors are some of the most common problems faced by the support pro. Learn how to accurately diagnose and troubleshoot various memory problems so that memory errors become less of a hassle and more of a quick fix.

Is it really the memory?
Reported memory problems are often caused by software or by other components. To rule out these causes and save myself some time, I ask the following questions before doing anything else:
  • Is the computer brand new? If so, have the vendor fix the problem under warranty. If this isn’t an option, read on.
  • Were new memory chips recently installed? Check for incorrect or incompatible chips and make sure that they are correctly configured and seated properly in their sockets.
  • Has any new software been installed? If the problems occurred soon after new software was installed, this could be the cause. Make sure the latest patches have been installed. Sometimes just reinstalling software will fix the problem. Since newer software tends to be more memory-intensive, older machines may not be able to handle the increased load. In this case, your only option may be to replace or upgrade the machine.
  • Was new hardware installed or removed? Check the computer’s components for any loose connections and make sure that the new hardware is working properly.
  • Did the problem appear on a computer that was previously working? The most perplexing memory problems tend to be of this kind. If the computer beeps but does not boot, this usually means that the CPU cannot communicate with the hardware. Ensure that all components are properly installed and that you have the latest BIOS.
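
The questions above amount to a short checklist that can be worked through before opening the case. The following Python sketch is one hypothetical way to encode it. The `Report` fields and the `first_steps` function are illustrative names, not part of any diagnostic tool, and the wording of each step is paraphrased from the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Report:
    """Answers to the triage questions (hypothetical field names)."""
    brand_new: bool = False
    new_memory: bool = False
    new_software: bool = False
    hardware_changed: bool = False
    beeps_without_booting: bool = False


def first_steps(report: Report) -> List[str]:
    """Return the checks to try first, in the same order as the questions."""
    steps = []
    if report.brand_new:
        steps.append("Have the vendor fix it under warranty")
    if report.new_memory:
        steps.append("Check the new chips for compatibility, configuration, and seating")
    if report.new_software:
        steps.append("Install the latest patches or reinstall the software")
    if report.hardware_changed:
        steps.append("Check for loose connections and verify the new hardware")
    if report.beeps_without_booting:
        steps.append("Reseat all components and update the BIOS")
    if not steps:
        # None of the questions applied, so nothing outside the memory is suspected yet.
        steps.append("No outside cause found: move on to the memory checks")
    return steps


if __name__ == "__main__":
    for step in first_steps(Report(new_software=True, hardware_changed=True)):
        print(step)
```

Several answers can be true at once, so the function returns every matching step instead of stopping at the first one.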

Beware when installing new RAM chips
Handling chips incorrectly can cause electrostatic discharge (ESD), in which static electricity from your body damages the chips. Hold the chips by the edges, never touch the contacts, and ground yourself often by touching a metal part of the computer’s case. Use a grounded antistatic wrist strap if possible.

True memory problems
Once you’ve eliminated all other possibilities, it’s time to check for an actual memory problem. The following list covers several potential memory problems and how to resolve them.
  • The screen is blank: Check that the VGA card and memory chips are seated properly. Check compatibility between the motherboard and the chips.
  • Not all memory is counted: This often means incompatible RAM chips have been installed. On many machines, chips are installed in pairs. If a new pair is not counted, check for compatibility with the motherboard and with the existing chips. Error checking and correction (ECC) chips also tend to report progressively less memory when there are problems. If the amount of missing memory is small, such as less than 10 MB, I usually leave it alone for the time being. After all, a 10-MB loss in a computer with 128 MB of memory is really not a cause for alarm.
  • Computer hangs or suddenly reboots: Check that there is sufficient memory. Check for possible corrosion between each socket and chip. A faulty power supply can also be the culprit.
  • General-protection faults: These are often caused by two pieces of software trying to occupy the same memory address. Rebooting usually solves the problem. If the faults begin immediately after new memory is installed, replace the new chips. If the problem does not reappear, check with the manufacturer of the problem chips for known difficulties. I usually fix this problem by making sure that all the chips come from the same batch from the same manufacturer.
  • Memory errors reported by computer: If you get a “memory mismatch” error, make sure that the memory settings are correct in the CMOS (complementary metal-oxide semiconductor) setup. Other errors such as “memory parity interrupt…,” “memory address error…,” “memory failure…,” and “memory verification error…” tend to occur when written information is not read back correctly from memory. The best way to check whether incompatible chips are responsible is to remove the new chips and see if the problem goes away. If it does, your old and new chips may be incompatible. Installing a complete set of new chips solves the problem.
  • Memory errors reported by server system manager: System managers are usually shipped with servers to monitor and report component abnormalities. Unless you are using ECC chips that automatically correct soft errors, a system manager will report a memory error if the rate of soft errors is greater than acceptable levels. I take this problem very seriously, as it can lead to server failure. Replace all the chips as soon as possible. Also make sure you have the latest BIOS.
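
Several of the errors in this list come down to one thing: information is written to memory and is not read back correctly. The Python sketch below illustrates that idea only. It is a simulation under stated assumptions, not a real diagnostic, because a Python program cannot address physical RAM directly. The names `find_readback_errors` and `StuckBitMemory` and the four test patterns are hypothetical choices made for this example.

```python
def find_readback_errors(memory, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, and list mismatches.

    Each mismatch is reported as (address, value_written, value_read).
    """
    errors = []
    for pattern in patterns:
        for address in range(len(memory)):
            memory[address] = pattern
        for address in range(len(memory)):
            if memory[address] != pattern:
                errors.append((address, pattern, memory[address]))
    return errors


class StuckBitMemory:
    """Simulated memory in which some bits of one cell are stuck at 1."""

    def __init__(self, size, bad_address, stuck_mask):
        self._cells = [0] * size
        self._bad_address = bad_address
        self._stuck_mask = stuck_mask

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, address):
        return self._cells[address]

    def __setitem__(self, address, value):
        # The faulty cell forces the stuck bits on, whatever was written.
        if address == self._bad_address:
            value |= self._stuck_mask
        self._cells[address] = value


if __name__ == "__main__":
    healthy = bytearray(8)                    # behaves like working memory
    faulty = StuckBitMemory(8, bad_address=3, stuck_mask=0x01)
    print(find_readback_errors(healthy))      # no mismatches expected
    for address, written, read in find_readback_errors(faulty):
        print(f"address {address}: wrote {written:#04x}, read {read:#04x}")
```

Alternating patterns such as 0x55 and 0xAA are used so that every bit is written as both 0 and 1. A bit stuck at 1 is invisible when the pattern already sets it, which is why the simulated fault shows up for only some of the patterns.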

Conclusion
Since memory problems can be caused by components other than the chips, they need to be resolved through a process of elimination. If quality brands are used, chances are good that you will not encounter defective chips or corroded slots. However, incompatibility, dirty sockets, outdated BIOS, and newly released software will always require your investigative expertise. Hail to your job security!

Rate this article
Now it’s time to tell us how we’re doing. Did you find Kyu’s article helpful? Do you have memory troubleshooting tips you would like to share? Post a comment or write to Kyu Rhee.

 