
Software RAID for Ubuntu LTS: Better to be safe than sorry


I recently experienced a disk failure on an HP DL320 G3 server. The cheap, high-capacity SATA disks were not in a RAID 1 mirror because the server's FakeRAID did not support the flavour of Linux in use. Luckily, important data and configuration files were safely backed up, but it's still rather annoying to have to rebuild the box from bare metal again. The faulty disk was replaced under warranty and I was good to go.

I've often heard that once a disk has gone bad, it's not unusual for other disks in the same enclosure to follow. I think this applies more directly to disks in a RAID configuration, as they will have experienced the same IO load; even outside of a RAID, though, the disks will have experienced the same environmental conditions. Not wanting to risk yet another rebuild, I decided to look at using software RAID.

After Googling for comparisons between software and hardware RAID, I was still unsure whether software RAID would degrade performance significantly. So, after building a software RAID 1 array and installing the Linux OS, I tested IO using bonnie++. I was expecting to see a slight decrease in IO performance but was surprised to see quite a large jump. I ran the tests on identical servers, one without RAID and one with software RAID. The tests were by no means scientific, but they gave me enough confidence in software RAID's performance to continue.
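For anyone wanting to run a similar comparison, a basic bonnie++ run looks something like the commands below. This is only a rough sketch rather than the exact invocation I used; the test directory, file size, and user are assumptions, and the file size should be at least twice the machine's RAM for meaningful results.

# mkdir /var/tmp/bonnie
# chown nobody /var/tmp/bonnie
# bonnie++ -d /var/tmp/bonnie -s 2048 -u nobody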

Getting set up

While the prospect of configuring software RAID can seem a little daunting to the first timer, it's actually a very simple process. In this example, I'm going to install Ubuntu 6.06 LTS Server and create four partitions in a RAID 1 array. I will also describe how to check the status of the array.

It seems some people prefer not to RAID swap space; I have chosen to do so, as I see very little swap usage on my servers and would like to have the disks fully mirrored.

Here, I'm using 21.5GB disks, split into the following partitions:

/        10GB
/tmp      3GB
/var      7GB
swap    1.5GB

First, boot into the Ubuntu installation program and continue until the Partition Disks screen. At this point, choose to manually edit the partition table. Select the first disk and create an empty partition table on it. Create your partitions as per usual; however, rather than setting the type to the default EXT3 file system, set them to Physical Volume For RAID. This applies to all partitions, including the swap area. Repeat this process on the second disk so that both have an identical partition table. Also set the bootable flag to On for the two root partitions. Once you're done, the partition table should look something like this:
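(This is an illustrative sketch based on the sizes above; device names and the exact partition numbering may differ on your system, particularly if the installer creates logical partitions.)

/dev/sda1    10GB    physical volume for RAID    (bootable)
/dev/sda2     3GB    physical volume for RAID
/dev/sda3     7GB    physical volume for RAID
/dev/sda4   1.5GB    physical volume for RAID
/dev/sdb1    10GB    physical volume for RAID    (bootable)
/dev/sdb2     3GB    physical volume for RAID
/dev/sdb3     7GB    physical volume for RAID
/dev/sdb4   1.5GB    physical volume for RAID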

Once the partitions have been created, move to the top of the Partition Disks menu and choose Configure Software RAID. The partition manager will ask if it can make changes to the partition tables; answer 'yes'.

We now need to create a MultiDisk device for each partition set created in the previous steps. Select Create MD Device followed by RAID 1; we want to use two active devices and zero spares. When asked to select the two active devices, select a set of matching partitions. So, to create the first MultiDisk device, I selected /dev/sda1 and /dev/sdb1. Continue this process until all of your partitions have been matched into pairs and the MultiDisk devices created.
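As an aside, what the installer does here is roughly what you would do by hand with mdadm on a running system. The commands below are only an illustrative sketch (the installer handles all of this for you, and the device names assume the partition layout described above):

# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
# mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
# mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda4 /dev/sdb4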

The software RAID devices will now be listed at the top of the Partition Disks menu. These RAID devices can be used just like normal partitions; you will need to edit each one, setting the filesystem type and mount point as you would with a standard disk partition. Mine looks like this:

Once you are happy with the partitioning, select Finish Partitioning And Write Changes To Disk and continue with the Ubuntu installation as normal.
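It's worth knowing that the resulting /etc/fstab will reference the md devices rather than the raw partitions. It should end up looking roughly like the sketch below; the md-to-mount-point mapping shown here is an assumption based on my partition layout, so check your own /etc/fstab rather than copying this verbatim:

/dev/md0   /      ext3   defaults,errors=remount-ro   0   1
/dev/md1   /tmp   ext3   defaults                     0   2
/dev/md2   /var   ext3   defaults                     0   2
/dev/md3   none   swap   sw                           0   0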

Now that installation is complete, the final step is to install the grub boot loader on both drives; by default, the installer will only put grub on the first disk. To do this, boot from the Ubuntu installation disc and select Rescue A Broken System. Continue through the various option screens until prompted to select a Device To Use As A Root Filesystem, then switch to a blank terminal with Alt+F2.

Mount the bootable RAID partition and chroot to it:

# mkdir /mnt/md0
# mount /dev/md0 /mnt/md0
# chroot /mnt/md0

Now enter grub and install the boot loader onto the MBR of both sda and sdb (thanks to the gentoo-wiki for help with this):

# grub
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

Notice that we address both disks as hd0; this is because if the first disk fails and you reboot, the second disk will become hd0.

Reboot with shutdown -r now and start up the system as per usual. Now at the command prompt, check the status of each RAID set with:

# cat /proc/mdstat
Personalities : [raid1]
md4 : active raid1 sda6[0] sdb6[1]
      4931840 blocks [2/2] [UU]
md3 : active raid1 sdb5[1] sda5[0]
      19534912 blocks [2/2] [UU]
md2 : active raid1 sda3[0] sdb3[1]
      166015616 blocks [2/2] [UU]
md1 : active raid1 sda2[0] sdb2[1]
      39061952 blocks [2/2] [UU]
md0 : active raid1 sda1[0] sdb1[1]
      14651136 blocks [2/2] [UU]
unused devices: <none>

As you can see, all of my RAID sets are clean and active. If one of the disks were failing or being rebuilt, that would be shown here.
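For comparison, a mirror with a failed member looks something like the snippet below. This is an illustrative sketch rather than output captured from a real failure, so treat the device names and block count as placeholders; the (F) flag marks the faulty member, and [2/1] [_U] shows that only one of the two devices is active:

md0 : active raid1 sda1[0](F) sdb1[1]
      14651136 blocks [2/1] [_U]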

I hope this has proven helpful to anyone looking at setting up a software RAID array for the first time. As I become more familiar with administering these arrays, I will post some tips on monitoring and managing them using the standard tools provided, such as mdadm.
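As a quick preview, and assuming the mdadm package is available on the system, the commands below show the sort of day-to-day management involved: inspecting an array in detail, then manually failing, removing, and re-adding a member (the device names are just examples):

# mdadm --detail /dev/md0
# mdadm /dev/md0 --fail /dev/sda1
# mdadm /dev/md0 --remove /dev/sda1
# mdadm /dev/md0 --add /dev/sda1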

I would be very interested to hear people's opinions on Hard vs. Soft RAID, issues that may have arisen, performance comparisons, and so on. Please leave a comment and share your views.

35 comments
formol76

Hello, I'm trying to install Ubuntu Server on RAID 1, with this config: http://img181.imageshack.us/img181/4771/0002dn4.jpg. At reboot I get "Operating system not found". I tried to do it like you wrote (# mkdir /mnt/md0, # mount /dev/md0 /mnt/md0, # chroot /mnt/md0), but I don't have /dev/md0, so I guess my RAID is not mounted? I require help :)

webmaster

I love you, whoever wrote this... LOL. This article saved me 7 months of work when a data crash took everything: 56,000+ PHP scripts and 14 GB of MySQL data. It wasn't 2 weeks after I put a quad RAID 1 into the production line, using this article first, that the systems were gone. The crash even affected one mirror; we were 75 minutes offline, 50 of them by choice, before everything was back to normal. Thank you. C.E.O. OurFreeInfo.com, Aaron St.Clair McMurray

contact

Good day: Last year I set up an Ubuntu 6.06 LTS server to host several virtual servers. Not having much money available, I used a Dell OptiPlex GX280 with two 300GB Seagate Barracuda 7200.9 drives set up in a RAID 1 array. Last week I checked the uptime of this "server": it had been running continuously for 413 days. Last night someone could not access one of the virtual servers on that box, so I tried connecting to it via VMware's Server Console. I could not connect, so I thought I'd reboot the host machine. After the POST screen appeared, I received an error message saying "No boot device found." I figured that either a drive had crashed or the drive controller had failed. After some troubleshooting, I discovered one of the drives in the array had crashed. So I removed the offending drive, powered on the host, and the system booted normally. After I receive my replacement drive, I will install it, remirror the array, and life will be good!

shardeth-15902278

...and LVM, I'd be a happy camper. I am really beginning to like the simplicity of Ubuntu. I was going to put it on my workstation after experiencing some frustration with getting Debian working properly. Unfortunately, A) I want LVM, so I can change partition sizes as required, and B) I want software RAID 0 to boost disk performance. Sadly, the Ubuntu workstation installer doesn't appear to support either. Excellent article though. Thanks for presenting it.

K12Linux

I was lucky enough to be privy to a conversation between two Dell server storage engineers and a pretty well connected Red Hat employee. All three agreed that, on modern PCs and servers, software RAID under Linux performed better than hardware RAID. The explanation was that the processors and RAM on all but the most expensive hardware RAID cards were pitiful compared to what is in a modern PC. Even with processor-hungry apps, the amount of CPU time used for RAID was offset by faster data access. Also, in servers with multiple disk storage buses, software RAID had an advantage over connecting all your drives to a single PCI card.

lesko

I did my RAID in software because I wanted to go cheap, so to speak... George Ou did a review on the G33 motherboard with hardware RAID, etc. The question is: with hardware RAID, are there similar tools under Linux that can do things like manually fail a disk and replace it while the server is live, or force it to do a rebuild? I know there are tools for Windows, but which cheap RAID has Linux utilities for this? Thanks

ez_tech

Test your settings. GRUB will not handle software RAID too well. Test it with one of the disks removed; you will see that it will not work (as far as I have tested in the past) with the first disk removed. For software RAID, I use LILO. It has proven itself countless times. Be aware, however, that there is no guarantee your system will keep working in the case of a disk failure; it might, and it might not. Because of this chance of reduced availability, I would also recommend verifying that swap is not on an MD device (two plain swap partitions go through some sort of striping anyway), as mirrored swap can pose a performance hit when your system is swapping. Your data remains intact either way: not full availability, but data integrity, which is the more important aspect. Ez.

rarthur

I have been running software RAID 1 on a server that started with Red Hat 9 and is currently running Fedora...so a few years now... This setup has survived one drive crash already, but when the bad drive began to fail, it was still spinning and started crashing the server. It was the 3rd crash (within a month) before the system wrote to the system log that FC had received some bogus info from the drive and crashed. During this time, the system was slow and processor utilization high. Replacement was easy, and for 80GB, took only a matter of minutes to re-sync the array. That box has been up again now for better than 140 days without trouble since. I used Webmin for all my disk management after installation and the process was very simple...and didn't require any command line work at all. I did a write-up of my experience rebuilding the array for anyone interested: http://www.speakyourgeek.com/news.php?extend.5

Photogenic Memory

I was thinking of doing something similar with Linux at home. I did this on Windows 2000 years ago and it worked quite well; I did it with 4 drives then, with Disk Management. There's lots of documentation on how to do it under Linux. I hear, though, that there is added security if you do it with a RAID card instead, because it's easier to administer and repair or replace? Also, the controller card may give you options to use more than one RAID setup, like RAID 0, 1, 0+1, 3, 5, or something totally experimental (usually not recommended). I also thought it was interesting that you had to enter the grub menu and reinstall the bootloader. Fun stuff. Anyways, I like this software option. It's cheap, haha! Now the real challenge is to keep those drives cool and functioning for as long as possible.

formol76

Great, it works! The RAID module, nvraid, had to be disabled in the BIOS.

Dumphrey

...but its Debian base does is encrypted file systems. Debian allows the setup and use of an entire encrypted file system structure, barring the boot record I would guess. Kinda nice really... lock down that laptop.

Dumphrey

...to create the partitions, then just do a "manual" partition and accept what's there. Works like a champ. I dislike the default partitioning Ubuntu does: hda1 is an all-inclusive root partition, and hda5 is a swap partition. The rest (hda2 - hda4) are probably related to LVM, but I have never really bothered to look that closely and see.

gladhatter

But as someone indicated, the hardware is actually a hybrid and you have to have the drivers installed in advance. This is not a huge problem per se once you figure out their arrangement. On my box they seem to prefer the FastTrak driver first, and I have always installed the Intel one first, which is a problem of sorts. Anyway, this RAID software is not ideal, but it's better than nothing; an external backup is still the only way to go.

publi

Three years ago I installed at my business a mail server running Debian with software RAID 1 (on an old machine with IDE disks at least 5 years old). Last year one of the disks crashed. Replacing it was as simple as connecting a new one and typing three commands.

gerard

Hi ez_tech, you just need to put the first-stage grub data on the second disk and it's on. Mind you, I've learned it the hard way... ;-) The command: install grub on the second drive in case the first one fails. This will allow the system to boot up. It's important to note that it will boot up in single user mode; you must use metadb to fix the meta database for the lost disk. See notes below. $ installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/[PartitionOn2ndDisk] Regards, Gerard

Justin Fielding

I think hardware RAID is always preferred if available. I would also rather use SCSI or SAS disks than IDE or SATA, but sometimes you have to work with what's available.

shardeth-15902278

I found Gparted as a disk, but not Qparted, and I don't see that either application supports LVM creation. Am I looking at the wrong thing? Or did I not clearly explain my need? I am after more than just being able to resize a partition. LVM also allows one to extend (or span) a volume across multiple drives. It is specifically this flexibility that I am after.

shardeth-15902278

hda2-4 are even there? From what I could tell, Ubuntu doesn't support LVM "out of the box". My guess was that the default config is 1 primary partition (hda1) for / and 1 logical partition (hda5) for swap. I will give the qparted disk a try though. Thanks for the tip.

naxnxtzoyzjv

Hardware RAID is great... if you can afford it and can guarantee that you will be able to get an *exact* replacement should it commit hara-kiri. Different hardware RAID controllers have a habit of doing things very slightly differently; replacing a Rev1 card with a Rev2 card could eat your data. Software RAID, on the other hand, is the same across all hardware, assuming you know how you've set up /etc/raidtab. One might either set up all their raidtabs the same, or print out the raidtab and tape it to the physical machine. If you don't have active service contracts or a "cold spare" handy, you'll want to be using softraid.

Justin James

I used to support NAS devices; some of them were mid-range BSD-based with hardware RAID, while the lower-end models were software RAID on Linux. Near the end, they introduced an ultra-high-end line also using hardware RAID, with Linux. I do not recall ever having a problem with the mid-range or high-end systems just trashing a RAID. It was fairly frequent with the low-end systems. The software RAID just was not as reliable, plain and simple, regardless of the RAID level. 0, 1, 5, I saw them ALL fail (particularly 5!) due to software error. They were also not nearly as fault tolerant; the RAID 5s especially had a hard time handling failing disks, I/O errors, etc. Personally, I do not recommend software RAID under any circumstances. Spend the extra $30 on a motherboard with a built-in hardware RAID controller, unless you have a special need like RAID on an ITX motherboard... J.Ja

Dumphrey

apt-get the tools to extend the installer before you start installing =) It's not that bad, though I admit I haven't tried it yet. And even though the Debian installer is ncurses-based, it owns the Ubuntu installer hands down, and the Fedora 7 installer is pretty slick as well. Though Fedora lasted about as long as it took to need to use Yum... =\ God, they (Fedora) need a better package manager.

shardeth-15902278

The Debian and (even more so) Fedora tools have spoiled me. Thanks though, I may break down and give it a try anyway.

Dumphrey

Yeah, I guess it does make sense to start the extended partitions at hdX5; that way you know it's extended, etc. I had just not thought about it that way before. Your rationale for the standard layout makes some sense as well, though I still find myself puzzled by the logical swap as opposed to a second primary partition. I am sure they have a reason, I just do not know what it is =\

shardeth-15902278

There could be some deeper, fundamental reason for it, based on x86 architecture. Not sure though; it's been too long since I was that deep in the hardware. Yeah, not sure why the extended partition for the swap, why not just a second primary? Maybe it has to do with keeping to a single simple algorithm which should work in the greatest number of cases (i.e., if a person has more than one OS on the drive, and therefore may already have one or more primary partitions, the boot code needs to be in a primary; everything else can go in extended). The thing I never got is why they didn't separate /home onto its own partition, but then, Windows doesn't do this by default either, so they are still offering an equivalent experience...

Dumphrey

...all logical drives start at hdX5? I can see some sense in that, though it was unexpected. But the default Ubuntu layout still does not make sense... hda1 is ext3, primary; hda5 is logical swap... no sense.

shardeth-15902278

The labels are actually fairly precise:
hda1 = the first primary partition
hda2 = the second primary partition
hda3 = the third primary partition
hda4 = the fourth primary partition
hda5 = the first logical drive in an extended partition
hda>5 = successive logical drives
So you actually won't have hda1 through hda5 on a system, as you can have 4 primary partitions and no extended, or up to 3 primary and 1 extended on a drive. You can have hda1 - hda4, or hda1 - hda3 and hda5 - ... (not sure what the limit is here, some multiple of 2 I imagine, 64? 128?)

Dumphrey

...or something similar, because every partition manager I have used labels partitions in order... i.e. 1, 2, 3, 4, 5... so if there is a partition 5, there HAS to be a 2-4... Ubuntu may not follow this rule though. A primary partition (85% of the disk, labeled hda1, with an ext3 file system) and a logical partition for the rest of the disk (15%) with a swap filesystem (labeled hda5). Following normal naming conventions, hda1 is fine, but the logical partition should be hda2 and the swap partition hda3... this is why I pre-format Ubuntu boxes. It makes my head hurt.

m_ilhami

I hope that someday the Linux installer will install the boot loader on the second disk automatically. I hope all Linux distributions do the same thing.

oz penguin

I have had too many RAID chips fail on motherboards, and you can never get that motherboard anymore; they change the technology way too quickly. I have not had any RAID cards fail before, but if one did, I know I would be able to go and get another (brand name) RAID card of the same model. I also have to admit to being a bit of a software RAID fan, but only for RAID 1 (mirrored) drives.

CharlieSpencer

Boy, is it ever. I lost two RAID controllers over the years, both in machines over four years old and not in production. It took longer to get the boxes out of and back into the rack than it took to actually replace the card. Software RAID is okay at home or on the cheap, but hardware is the only way to go on a production server. Plus, if you change OSs or have to rebuild the machine for some other reason, RAID is already in place.

Justin James

You have a good point there. Hardware RAID fails a small fraction of the time compared to software RAID, in my experience. And, of course, we should all be doing backups. In a nutshell, RAID is risk mitigation, nothing more. From where I sit, hardware RAID is so insanely cheap that its overwhelming advantages over software RAID make it worth the extra $30 for a motherboard that supports it, or up to a few hundred dollars for a RAID card (depending on model, features, etc.). Is it perfect? Not if the hardware fails, which is extremely rare. But the risk has been lowered many times over compared to software RAID. You are right, though; professional shops (as opposed to home users) will want to keep spare RAID cards of the exact same model on the shelf, "just in case", since it is faster to replace a RAID card than to restore from tape, not to mention up-to-the-minute. J.Ja

jerickson

The writer of the article mentioned the server's "FakeRAID" wasn't compatible with their flavour of Linux. That's the problem with on-board RAID: it isn't a full hardware solution. It is part hardware and part software, and your OS needs to have drivers to support it. I ran into this with a test box, an old Compaq ProLiant ML310 with integrated ATA RAID. Without the correct driver loaded it looked like 2 drives to the OS, so it didn't mirror correctly (it showed up as 1 drive after the correct driver was loaded, but I didn't trust it since that was after the OS was installed), not to mention that the performance was horrible (in this particular case). I'm in search of a true hardware RAID controller...