Message 31 of 52
Re: Which HDD in RAID-1 defective?
If you suspect a problem, you can run 'cat /proc/mdstat' in a terminal at any time; it lists all your arrays, their member disks, and their status. A bad disk makes a RAID-1 array show only one active disk, for example '[2/1] [U_]' instead of '[2/2] [UU]' (or one fewer than usual if the array has more than two mirrored drives).
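If you'd rather not eyeball the output, here is a minimal sketch of that check as a shell function. The name 'degraded_arrays' is my own invention, and it assumes the usual /proc/mdstat layout in which each array's status line ends with a member map such as '[UU]' or '[U_]'.

```shell
# Hypothetical helper: print the name of every md array whose member map
# (e.g. [U_]) contains "_", which marks a missing or failed member.
# Reads /proc/mdstat unless another file is given as the first argument.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md1 : active raid1 sda2[0]"
        /^md[^ ]* :/ { name = $1 }

        # Status line, e.g. "4192896 blocks [2/1] [U_]"
        match($0, /\[[U_]+\]/) && name != "" {
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling 'degraded_arrays' with no argument reads the live /proc/mdstat; it prints nothing when every array has all of its members.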

There's a better method than ad hoc status checks, however! 'mdadm' can email failure notices when it runs in monitor mode. I install 'ssmtp', which is a very lightweight send-only mailer (not a full mail server) that takes the place of 'sendmail' or 'postfix'. It's drop-dead easy to configure, and it can send mail to my Gmail account directly via Google's SMTP server.
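For illustration, the two config pieces might look roughly like this. The addresses and password are placeholders, the file locations vary by distro (I'm assuming /etc/ssmtp/ssmtp.conf, and /etc/mdadm/mdadm.conf or /etc/mdadm.conf), and your Gmail account may need different authentication settings, so treat it as a sketch rather than a known-good setup.

```
# ssmtp.conf (assumed location: /etc/ssmtp/ssmtp.conf)
root=you@gmail.com            # placeholder: where mail for root ends up
mailhub=smtp.gmail.com:587    # Google's SMTP submission port
AuthUser=you@gmail.com        # placeholder account
AuthPass=your-password-here   # placeholder; keep this file readable by root only
UseSTARTTLS=YES

# mdadm.conf (assumed location: /etc/mdadm/mdadm.conf or /etc/mdadm.conf)
MAILADDR you@gmail.com        # placeholder: where mdadm sends its alerts
```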

Then, if/when there's a problem with any of the RAID-1 arrays in the system, I get an email (causing my smartphone to beep at me; I have Gmail on Android) telling me which system, which array, and which disk is bad. (There's also a test mode for 'mdadm' that fires off a test email so you can verify that everything's working end-to-end.)
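The test mode mentioned above can be run as a one-off along these lines. The wrapper function 'send_test_alert' and the 'MDADM' override variable are hypothetical conveniences of mine (the override lets you dry-run it with 'echo'); the mdadm options themselves are the standard monitor-mode ones.

```shell
# MDADM is a hypothetical override so the command can be dry-run (e.g. MDADM=echo).
MDADM="${MDADM:-mdadm}"

# Hypothetical wrapper: send one test alert per array, then exit.
send_test_alert() {
    # --monitor : run in monitor mode
    # --scan    : take the array list (and MAILADDR) from the config file
    # --oneshot : check the arrays once instead of running as a daemon
    # --test    : generate a TestMessage alert for every array found
    "$MDADM" --monitor --scan --oneshot --test
}
```

Run 'send_test_alert' as root; if the mail setup is right, one TestMessage email per array should arrive at the MAILADDR address.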

The other thing I do is configure my systems so that they boot even if any of the RAID-1 arrays is missing a disk. (For example, after a power outage that outlasts my UPS battery, they're configured to reboot when power is restored.) I want the servers to boot regardless and send me an email so that I know; I then get a replacement disk when I can, which minimizes overall downtime.
Posted by Brainstorms
11th Aug 2011