Sun Microsystems' Solstice DiskSuite is a powerful disk management metatool that can be used to configure RAID 5 (disk striping with parity) on Solaris servers.
In this Daily Drill Down, I will show you how to configure software RAID 5 on a Solaris 8 server installation (with enough space reserved for the initial state databases and the RAID metadevice) using only DiskSuite command line tools.
Note: If you need to know how to deploy DiskSuite, see my article “Install DiskSuite and be on your way to simple disk administration.”
Databases and metadevices
The initial state databases store the configuration and state information for your DiskSuite setup, including the RAID 5 metadevice. A metadevice is a group of physical disk slices that appears to the system as a single logical device.
Solaris RAID 5 requirements
To deploy RAID 5, you’ll need three physical disk drives, each with a small slice set aside to hold two initial state database replicas.
Unavailable partitions for RAID 5
Keep in mind that you can't use RAID 5 on the /, /usr, or swap partitions.
The initial state database slices (remember, a “slice” is Sun jargon for a “partition”) can be a part of one of the slices that will make up the metadevice. You need to reserve space during your initial install to create the state databases or make room for the state databases using space from the swap slice. Each initial state database occupies 517 KB or 1034 sectors. If you intend to have one of the installed slices be part of the RAID 5 metadevice, then you'll need another partition (or slice) to which you’ll save the data (or the data will be destroyed during the creation of the metadevice).
Three disks are required to maintain the minimal number of state databases after a disk failure. Three disks are also required to allow RAID 5 to continue functioning. The system will not boot into multiuser mode unless one more than half of the total state database replicas are present. By putting two replicas on each of three disks (six in total), you will still have the required four after losing any one disk.
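The replica arithmetic can be checked with a few lines of shell. This is only a sketch: the function names are made up for illustration, and the rule it encodes is simply the one stated above (one more than half of all replicas), with the same number of replicas assumed on every disk.

```shell
#!/bin/sh
# Replicas needed to boot multiuser: one more than half of all replicas
# (the rule as stated in the text).
replicas_needed() {
    total=$1
    echo $(( total / 2 + 1 ))
}

# Replicas left after losing one whole disk, assuming an equal
# number of replicas on each disk.
replicas_after_disk_loss() {
    disks=$1
    per_disk=$2
    echo $(( (disks - 1) * per_disk ))
}

disks=3
per_disk=2
total=$(( disks * per_disk ))
echo "total replicas:      $total"
echo "needed to boot:      $(replicas_needed "$total")"
echo "left after one loss: $(replicas_after_disk_loss "$disks" "$per_disk")"
```

With three disks and two replicas each, the number left after a single disk failure matches the number needed, which is why three disks is the minimum.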
Your three disks do not all have to be the same size, but the redundant slices on the different disks do need to allocate the same space, or you'll just end up with wasted space on the metadevice.
In my particular configuration, I've got two 2.1-GB drives in the Sparc20, as well as a 2.14-GB external drive. The sizes of the drives used in my example were chosen for speed. In the enterprise setting, these drives will be much larger, so the process will take much more time.
Before setting up my metadevice slices, I needed to look at the current filesystem setup and determine what I wanted to include in the RAID 5 setup and the space needed. The existing filesystem can be determined using the df -k command, as shown in Listing A.
Prompt the OS to recognize new drives by running the probe-scsi-all command at the OK> prompt. To get to the OK> prompt (also known as the PROM firmware), enter the [Stop][A] key combination or run the shutdown -i0 command. Once the drives show up, run the boot -r command to perform a reconfiguration boot.
Listing B shows the details of each partition created for this installation.
Since I can’t include /, /usr, or swap on my metadevice, the goal is to move /export/home, which will hold my end users’ home directories and data, to the RAID 5 metadevice. To do that, I've created identical 1.97-GB slices on the two additional hard drives (labeled ALPHA and OMEGA).
I allocated initial state database slices on extra slices on all three of the RAID drives. I arbitrarily set aside about 2 MB on each drive, which is more than enough for the state databases, and hardly enough to be missed. Since the geometries of the drives being used are slightly different, they are not all exactly the same size.
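To see why about 2 MB per drive is plenty, the per-replica figure quoted earlier (1034 sectors) can be multiplied out. The sketch below assumes 512-byte sectors, which is what makes 1034 sectors come to 517 KB; the function name is only illustrative.

```shell
#!/bin/sh
# Figures from the text: each replica occupies 1034 sectors.
# A sector is assumed to be 512 bytes.
SECTORS_PER_REPLICA=1034
BYTES_PER_SECTOR=512

# Print the space, in KB, taken by a given number of replicas.
replica_space_kb() {
    count=$1
    echo $(( count * SECTORS_PER_REPLICA * BYTES_PER_SECTOR / 1024 ))
}

echo "1 replica:  $(replica_space_kb 1) KB"
echo "2 replicas: $(replica_space_kb 2) KB"
```

Two replicas per slice come to roughly 1 MB, so a 2-MB slice leaves comfortable headroom.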
Since I have extra space on the first RAID drive, I can copy /export/home to this slice temporarily to create the RAID 5 metadevice. Otherwise, the existing data would be lost. Alternatively, you could back up /export/home to tape and restore to the metadevice or create the metadevice on another slice. Since I want to distribute the RAID 5 metadevice across three physical disk drives for maximum protection, I’ll want to include the existing /export/home slice in my metadevice.
Creating the filesystem
So far I have made the assumption that the RAID 5 creation is on a machine with existing data (hence the copying of /export/home above). The creation of RAID 5 is not limited to existing installations. If this is a clean install and /export/home doesn't have any data yet, you can just create the metadevice over the existing slice (remember to unmount the partition first). You can now create a filesystem on the spare slice with the newfs c0t4d0s1 command. After running this command, you will see the output as shown in Listing C.
Create a new directory where the new partition can be mounted with the mkdir /mnt/home command. Mount the new partition with the mount /dev/dsk/c0t4d0s1 /mnt/home command. At this point, you’ll want to tar and copy the current /export/home to /mnt/home with the commands:
cd /export/home
tar -cf - . | (cd /mnt/home; tar -xf -)
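Before destroying the original slice, it is worth confirming that the copy is complete. The sketch below shows the same tar pipe followed by a recursive diff. It uses throwaway temporary directories as stand-ins for /export/home and /mnt/home, and it assumes mktemp is available on your system; the copy_tree function and the sample file are purely illustrative.

```shell
#!/bin/sh
# Copy a directory tree with a tar pipe.
copy_tree() {
    src_dir=$1
    dst_dir=$2
    # Each side runs in its own subshell so the cd does not
    # change the caller's working directory.
    (cd "$src_dir" && tar -cf - .) | (cd "$dst_dir" && tar -xf -)
}

# Temporary stand-ins for /export/home and /mnt/home.
src=$(mktemp -d)
dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT

mkdir -p "$src/alice"
echo "hello" > "$src/alice/notes.txt"

copy_tree "$src" "$dst"

# diff -r exits 0 only when the two trees have the same contents.
if diff -r "$src" "$dst" >/dev/null; then
    echo "copy verified"
else
    echo "copy differs" >&2
fi
```

On the real system, the equivalent check would be diff -r /export/home /mnt/home, run before /export/home is unmounted.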
Initial state databases
Now it’s time to create the initial state databases. Use the metadb command with the -a (add), -f (force creation of the initial state database), and -c (number of replicas per slice) flags to create two state database replicas on each of the three slices you reserved earlier, as shown in Listing D (the output of the last command is shown as well).
If you don’t allocate enough space for the initial state databases, you'll get a space-allocation error instead of the output listed in Listing D.
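As a rough sketch of what this step can look like, the script below builds the metadb commands for three slices. The slice names are hypothetical placeholders, not the ones from my system, and the script only prints the commands unless DRY_RUN is set to 0 on a Solaris host with DiskSuite installed. The run helper is an illustrative convenience, not part of DiskSuite.

```shell
#!/bin/sh
# Dry run by default: print each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical slice names; substitute the slices you reserved.
DB_SLICES="c0t1d0s7 c0t3d0s7 c0t4d0s7"

create_state_databases() {
    first=1
    for slice in $DB_SLICES; do
        if [ "$first" = "1" ]; then
            # -f is needed only while no state database exists yet.
            run metadb -a -f -c 2 "$slice"
            first=0
        else
            run metadb -a -c 2 "$slice"
        fi
    done
    # List the replicas along with a legend for their status flags.
    run metadb -i
}

create_state_databases
```

Passing -f on every call, as the flags described above suggest, should also work; the sketch drops it after the first slice only because the force is no longer required once a state database exists.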
Creating the RAID 5 metadevice
Since you are going to be destroying and reformatting the /export/home slice when you create the RAID 5 metadevice, you don’t want to have it mounted by the OS during the process. First make sure no user is logged on and using the /export/home directory. Then unmount /export/home and create the RAID 5 metadevice by running the commands shown in Listing E.
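In outline, this step amounts to an unmount followed by a single metainit call with the -r option. The sketch below prints those commands; the three slice names are hypothetical (one per physical disk), and nothing is executed unless DRY_RUN is set to 0 on a DiskSuite host. The run helper is illustrative only.

```shell
#!/bin/sh
# Dry run by default: print each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical slices, one on each physical disk.
RAID_SLICES="c0t1d0s6 c0t3d0s6 c0t4d0s6"

create_raid5() {
    # The slice holding /export/home must not be mounted while it is reused.
    run umount /export/home
    # -r builds a RAID 5 metadevice from the listed slices. With no -i
    # option, the default interlace is used. $RAID_SLICES is unquoted on
    # purpose so that it splits into one argument per slice.
    run metainit d55 -r $RAID_SLICES
}

create_raid5
```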
When the RAID 5 metadevice has been successfully created, the command will print d55: RAID is setup. Since you didn't specify an interlace, d55 uses the default. The system verifies that the RAID 5 metadevice has been set up, and begins initializing the metadevice. You must wait for the initialization to finish before you can use the RAID 5 metadevice. The metastat command can tell you what state the metadevice is in:
Initialization in progress: 0% done
This initialization will take a few minutes, depending on the slice size and the speed of your system. You'll need to check periodically (by using the metastat command) to see when the state says “Okay,” before continuing. If you use the metatool, it will periodically update the status (without user intervention).
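If you would rather not check by hand, a small polling loop can do the waiting. The sketch below assumes that metastat keeps printing the "Initialization in progress" line shown above until the metadevice is ready. The wait_for_init function and the STATUS_CMD variable are my own illustrative names; STATUS_CMD exists only so the loop can be tried without DiskSuite.

```shell
#!/bin/sh
# Command used to query status; override it to test without DiskSuite.
STATUS_CMD=${STATUS_CMD:-metastat}

# Poll until the status output no longer reports initialization in progress.
wait_for_init() {
    dev=$1
    interval=${2:-60}   # seconds between checks
    # grep echoes the progress line on each pass and ends the loop
    # once the line no longer appears.
    while $STATUS_CMD "$dev" | grep "Initialization in progress"; do
        sleep "$interval"
    done
    echo "$dev: no initialization in progress; confirm the state with metastat"
}

# Usage on a DiskSuite host:
#   wait_for_init d55 60
```

The loop also ends if the status command fails outright, so treat its message as a prompt to run metastat and look for a state of Okay rather than as proof of success.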
Creating the metadevice filesystem
The next step is to create the filesystem on the metadevice. The net size will be (n - 1)*slice_size (where n is the number of slices used for the metadevice). So three 2-GB slices will yield 4 GB of usable space, and four 2-GB slices will yield 6 GB. One slice's worth of space is always lost to storing parity data. This utilization is better than the 50 percent actual data space a plain RAID 1 mirror would offer.
One drawback is that the initial filesystem creation on the RAID 5 metadevice will be considerably slower than running the newfs command on a normal disk. The creation of the filesystem is shown in Listing F.
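The capacity formula is easy to try out in shell. This is just the (n - 1)*slice_size arithmetic from above, plus the resulting utilization as a whole-number percentage; the function names are illustrative.

```shell
#!/bin/sh
# Usable RAID 5 capacity: (n - 1) * slice_size.
raid5_usable() {
    slices=$1
    slice_size=$2
    echo $(( (slices - 1) * slice_size ))
}

# Share of raw space that holds data, as a whole-number percentage
# (integer division rounds down).
raid5_utilization() {
    slices=$1
    echo $(( (slices - 1) * 100 / slices ))
}

for n in 3 4; do
    echo "$n x 2 GB: $(raid5_usable "$n" 2) GB usable, $(raid5_utilization "$n")% of raw space"
done
```

Adding slices improves utilization, since the parity overhead stays at one slice's worth no matter how many slices are in the metadevice.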
Restoring and auto mounting the data
Earlier, I showed you how to copy the /export/home directory to /mnt/home. It’s now time to restore that data. If the new metadevice is not already mounted on /export/home, mount it first (for example, with the mount /dev/md/dsk/d55 /export/home command). To restore /export/home, move to the copy with the command cd /mnt/home and then copy the data back with the command: tar -cf - . | (cd /export/home; tar -xf -). The data is now unpacked and located in the proper place.
To have the RAID 5 metadevice mounted at boot, add an entry for it to /etc/vfstab, as shown in Listing G.
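For orientation, a vfstab entry has seven fields: device to mount, device to fsck, mount point, filesystem type, fsck pass, mount at boot, and mount options. The sketch below prints an entry for a UFS filesystem on a metadevice. The fsck pass of 2 and the "-" for options are typical values that I am assuming here, and the vfstab_entry function is only illustrative.

```shell
#!/bin/sh
# Print a tab-separated vfstab entry for a UFS filesystem on a metadevice.
# Fields: device to mount, device to fsck, mount point, FS type,
# fsck pass, mount at boot, mount options.
vfstab_entry() {
    md=$1
    mount_point=$2
    printf '%s\t%s\t%s\tufs\t2\tyes\t-\n' \
        "/dev/md/dsk/$md" "/dev/md/rdsk/$md" "$mount_point"
}

vfstab_entry d55 /export/home
```

Note that the block device (/dev/md/dsk) goes in the first field and the raw device (/dev/md/rdsk) in the second. Remember to remove or comment out the old /export/home entry so the slice is not mounted twice.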
Reboot the server and make sure that your new metadevice is mounted. You can check the state of your RAID 5 metadevice with the metastat command. This command should report that d55 status is Okay as shown in Listing H.
Take note of the Hot Spare column in Listing H. You can set up/assign additional slices to act as hot spares, which, when associated with a RAID 5 metadevice, allow DiskSuite to automatically substitute them for failed slices. The process is much the same as creating the RAID 5 slices: Create the slice with the proper format and then use the metainit command.
Make sure your hot spare is on a drive separate from your RAID 5 slices and then follow these steps:
- Create the hot spare pool with the metainit hsp001 c0t4d0s1 command
- Associate the spare pool with the metadevice with the metaparam -h hsp001 d55 command
- Check the status of the metadevice with the metastat command
Removing a hot spare pool
To remove the hot spare pool association from the metadevice, use the metaparam -h none d55 command.
DiskSuite vs. command line
DiskSuite's metatool is powerful and simple to use, and it can make the creation of various file systems a much easier task. But in the end, the metatool by itself doesn’t offer complete control, and, like most UNIX system administrators, you want ultimate control over your tools. The quickest route to that control is the command line, and that’s where the DiskSuite command line tools step in.
Sure, creating RAID 5 from the command line isn’t necessarily the path of least resistance, but it will allow you to create RAID 5 on a server that has no metatool available and it will ensure that you understand exactly what is happening to your server as the RAID 5 solution is implemented.