Mac Hard Drive Testing Software - How good is SMART testing?

JackR10
How good is SMART (Self-Monitoring, Analysis and Reporting Technology) at actually finding and predicting drive problems, both for regular hard drives and now for SSDs?

Different testing tools for Macs on the market report different levels of SMART detail. For example, Disk Utility, Scannerz, and SMARTReporter all seem to report only what I would call a catastrophic failure level or an "OK" level, so you have no idea what the details are. Others like smartmontools, TechTool Pro, and I think Drive Genius (not sure on that) can report a lot more detail with a lot of different parameters.

I had a drive in my system and I was preparing to do a clean install of Mountain Lion on it. I have TechTool Pro and thought I'd check out the SMART status on the drive before installing, just to make sure it was OK. All the SMART parameters came back looking good. Note, I'm not blaming TechTool Pro for this; I think the problem is with problem detection in SMART itself, as you'll soon see.

I went ahead and started doing the install, and about 20 minutes into it the thing started tapping. If you've ever seen a drummer do one of those rhythms where they're tapping on the drum rim instead of the actual drum head, that's about what it sounded like. After maybe 30 seconds of this the thing started screeching like crazy. The install terminated. I put an old Snow Leopard install disc into the system and booted off the DVD, and now Disk Utility is saying the drive is unusable and that SMART status has failed.

If I checked this drive with extensive SMART capabilities just about 45 minutes before and it reported no problems, what good is SMART testing? Since I'm considering replacing the drive with an SSD, is SMART any better on that?
  • HAL 9000 Moderator

    Self Monitoring Analysis and Reporting Technology

    So it monitors the HDD in real time and reports on the things it is programmed to watch when they happen.

    SMART cannot report faults before they happen, and with mechanical HDDs a failure may be induced by other factors, ranging from environmental to user-induced problems.

    I've seen students' notebooks left running on their beds; when that happens the air intakes on the bottom of the notebook get blocked, cooking the CPU and HDD. There is currently no testing utility or method in the world that can report user-induced problems before they cause a failure.

    But on the upside, it's always possible to recover a lot of data off an HDD, provided you are prepared to spend the necessary funds to do so.

    As an example, Kroll Ontrack's data recovery service recovered data off an HDD that was on board the Space Shuttle Columbia after it suffered a catastrophic failure on reentry and was destroyed. The HDD involved was found at the bottom of a marsh several months after the crash. It had fallen from the disintegrating spacecraft traveling fairly fast, been exposed to heat from the air that was cooking it and trying to destroy it, and then, after being buffeted around for some time, hit water and mud. As if that were not enough, it was buried in the muddy water for months, where it filled up with crud before it was finally rescued and, some time later, handed over for the recovery attempt.

    With any mechanical HDD, provided that the magnetic coating on the platters is still there, it's possible to recover the data. I'm not really sure about SSDs, as they are a different technology.

    www.krollontrack.co.uk/resource-library/newsletter-centre/ontrack-data-recovery-newsletter/columbia-drive-recovery/

    That's what is actually possible.

    Col

    Bob OS X

    SMART would not have helped you, and no other tool on Earth would have either. You likely experienced a failure of the controller, which is the circuit card connected to the underside of your drive. What caused it I can only guess, but probably a marginal chip decided to go south, a capacitor failed, etc. What likely happened is that when the controller failed, it sent signals to the drive telling it to move to an extreme where the control arm has no business being, causing the tapping sound, and then something probably, literally, broke off, causing the squealing noise.

    SMART is a monitoring and reporting technology, but the monitoring it does focuses on internal drive mechanical performance. Most SMART implementations will not, for example, detect bad sectors on a drive. Tools like Scannerz, TechTool Pro, and I suppose thousands of other tools on the market I've never heard of can detect bad sectors. SMART status only acknowledges a bad sector when a write fails, so you need tools like these to first find the bad sectors, and then you reformat the drive with a zeroing option to force the drive and SMART to re-map each bad sector to a good one.

    Even this is problematic, because weak sectors, which can take seconds to read, will not be acknowledged as bad sectors by SMART. A weak sector can be read and written to but, if I understand it properly, the surface of the drive is magnetically weak there, and the signal it generates on a read takes several tries to acquire the data. Some drives will not mark such a sector as bad until a timing threshold is reached, and in many cases that threshold can be on the order of seconds. Picture, if you will, a cache file that needs to be read 20 times and contains a weak sector that takes 3 seconds to read: that's a full minute of waiting. Sounds like spinning beach ball city to me! Scannerz is, to the best of my knowledge, the only Mac application available that identifies weak sectors.
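To make the weak-sector idea concrete, here is a minimal Python sketch of timing each block read and flagging the slow ones. It is only an illustration of the general approach, not how Scannerz or any other product works. The names `find_slow_blocks` and `worst_case_delay`, the 4096-byte block size, and the 0.5 second threshold are all assumptions made up for this example.

```python
import io
import time
from typing import BinaryIO, Callable, List, Tuple


def find_slow_blocks(
    stream: BinaryIO,
    block_size: int = 4096,          # assumed block size, not a drive constant
    threshold_s: float = 0.5,        # assumed "too slow" cutoff in seconds
    clock: Callable[[], float] = time.perf_counter,
) -> List[Tuple[int, float]]:
    """Read the stream block by block and return (block_index, seconds)
    for every block whose read took at least threshold_s."""
    slow = []
    index = 0
    while True:
        start = clock()
        data = stream.read(block_size)
        elapsed = clock() - start
        if not data:                 # end of stream
            break
        if elapsed >= threshold_s:
            slow.append((index, elapsed))
        index += 1
    return slow


def worst_case_delay(reads: int, seconds_per_read: float) -> float:
    """Total stall if the same slow block is read `reads` times."""
    return reads * seconds_per_read


# Demo with an in-memory stream and a scripted clock, so the result is
# repeatable: the second of three blocks "takes" 3 seconds to read.
ticks = iter([0.0, 0.1, 1.0, 4.0, 5.0, 5.0, 6.0, 6.0])
slow = find_slow_blocks(io.BytesIO(b"x" * 12), block_size=4,
                        clock=lambda: next(ticks))
print(slow)                          # block 1 is flagged as slow
print(worst_case_delay(20, 3.0))     # the 20 reads x 3 seconds case above
```

On a real drive you would open the raw device rather than a file, since the operating system's cache can hide slow reads once a block has been read. That usually needs administrator rights, and the right threshold depends on the drive.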

    SMART technology on hard drives isn't worthless, but I would tend to think it's best to use it to pay attention to problems related to mechanical wear and not be alarmed by something like sector re-allocations unless they're growing abnormally. Like I said at the start, you likely suffered a catastrophic controller board failure, and no tool could have picked that up.
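If you want to watch whether re-allocations are growing, rather than relying on a pass/fail status, one rough approach is to save the attribute table printed by smartmontools (`smartctl -A`) from time to time and compare the raw values. The sketch below assumes the usual ten-column table layout. The two sample reports are invented for illustration and are not from the drive in this thread, and `parse_raw_values` and `reallocation_growth` are hypothetical helper names.

```python
from typing import Dict

# Invented sample reports in the usual `smartctl -A` table layout.
WEEK_1 = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4312
194 Temperature_Celsius     0x0022   036   045   000    Old_age   Always       -       36 (Min/Max 20/45)
"""

WEEK_2 = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4480
194 Temperature_Celsius     0x0022   035   045   000    Old_age   Always       -       35 (Min/Max 20/45)
"""


def parse_raw_values(report: str) -> Dict[str, int]:
    """Map attribute name -> raw value for every table row whose raw
    value starts with a plain integer; other rows are skipped."""
    values = {}
    for line in report.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        try:
            values[parts[1]] = int(parts[9])
        except ValueError:
            continue                 # raw value in some vendor-specific format
    return values


def reallocation_growth(earlier: str, later: str,
                        name: str = "Reallocated_Sector_Ct") -> int:
    """Change in the named raw value between two saved reports."""
    return parse_raw_values(later).get(name, 0) - parse_raw_values(earlier).get(name, 0)


print(reallocation_growth(WEEK_1, WEEK_2))
```

Attribute names and the meaning of raw values vary between drive vendors, so treat a trend like this as a hint, not a verdict. And as noted above, a sudden controller board failure would not show up in these numbers at all.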
