Helping Linux make a leap into the enterprise with ReiserFS

Journaling file systems are becoming accepted as an enterprise standard because of their ability to safely and efficiently recover after a crash. Join Vincent Danen as he walks you through installing the ReiserFS journaling file system in Linux.

You may have heard the term "journaling file system" bouncing around lately and wondered what the fuss was all about. It's not a new term, by any means, but it is relatively new to Linux. Let me explain. Traditionally, Linux has used (and still does use) extended file system 2, or ext2, for its primary file system. This is like the FAT file system for DOS, HPFS for OS/2, or NTFS for Windows NT. Of these, only NTFS logs its metadata changes; ext2, FAT, and HPFS are not journaling file systems.

ReiserFS is a file system that’s becoming increasingly popular among Linux users. Started in 1993 by Hans Reiser, the file system has a very good chance of becoming a part of the Linux 2.4 kernel, although right now you need to apply a few patches in order to get ReiserFS to play nicely with your kernel 2.2-based system.

As it stands, ReiserFS is known primarily as a journaling file system, but it has the potential to become much more than that. The author has said that the long-term goal for ReiserFS is to move database and keyword-searching features into the file system itself. The advantages of such a file system should be obvious; fast file finds are the least of what it would have to offer. ReiserFS is modeled after the AS/400, which has a relational database at its core. Combining the relational database nature of the AS/400 with the many benefits of traditional UNIX file systems is the hope for the ReiserFS project of the future.

What is a journaling file system?
Because the subject is complex, I'm going to give you only a very brief overview of a journaling file system. Describing the theory and working architecture behind a journaling file system and ReiserFS itself is beyond the scope of this article.

To put it succinctly, a journaling file system records changes to the file system in a transaction log (the journal), whereas a traditional file system (like ext2) relies on a program, such as fsck, to check the entire partition after an unclean shutdown. Consequently, after a system crash (or an accidental reset), ReiserFS merely needs to consult the transaction log to determine whether or not there are any problems with the state of the file system.

The term “journaling” comes from the fact that ReiserFS—and any other journaling file system—journals (or logs) transactions by first writing them to a log buffer, then asynchronously writing the log buffers to the on-disk log. After a crash, the ReiserFS recovery tool, reiserfsck, reads the on-disk log. If necessary, transactions are repeated to ensure that the file system remains in a consistent state.
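The log-then-apply idea can be illustrated with a toy model. This is only a sketch under loud assumptions: the "journal" is a plain text file, the "disk" is a temporary directory, and the function names are invented for illustration. It says nothing about how ReiserFS lays out its real on-disk log.

```shell
#!/bin/sh
# Toy model only: the "journal" is a text file and the "disk" is a directory.
# None of these names come from ReiserFS itself.
journal=$(mktemp)
disk=$(mktemp -d)

# Step 1: record the intended change in the journal before touching the disk.
log_txn() { printf '%s %s\n' "$1" "$2" >> "$journal"; }

# Step 2: carry out the change on the "disk".
apply_txn() { printf '%s\n' "$2" > "$disk/$1"; }

# Recovery: repeat every logged transaction. Re-applying one that already
# reached the disk is harmless, because it writes the same content again.
replay_journal() {
    while read -r name value; do
        apply_txn "$name" "$value"
    done < "$journal"
}

log_txn notes.txt draft1
apply_txn notes.txt draft1
log_txn todo.txt buy-milk      # simulated crash: logged but never applied

replay_journal
cat "$disk/todo.txt"
```

After the replay, todo.txt exists even though the simulated crash happened before it was written. Recovery only has to walk the log, not the whole "disk", which is why a journaled partition comes back so much faster than one that needs a full check.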

The long and short of this is that if there is a problem—if your system crashed or your cat brushed up against the reset button—the system will restart much faster with ReiserFS partitions than with regular ext2 partitions. Unfortunately, I cannot provide you with details on how much faster simply because I have no more ext2 partitions left to benchmark against. But, there is a remarkable increase in system boot up speed. And, depending on how often the system has rebooted, you may no longer need to worry about periodic fsck checks during boot up. ReiserFS handles all of this quickly and efficiently.

You’ll notice also that you’re no longer limited to file sizes of approximately 2 GB as you are with the ext2 file system. Files and directories can be as large as you want them to be. You’ll suffer a marginal performance hit with larger directories or files, but not nearly as much as you would using ext2 (if the file you’re using would even fit in an ext2 partition!).

Compile ReiserFS into your kernel
ReiserFS is being used on Linux systems as a replacement for ext2. While there are other journaling file systems in the works, ReiserFS is by far the most mature. That is not true in terms of overall design (after all, XFS has been used on IRIX systems for years), but it is true in terms of Linux usability. ReiserFS can be used by people running Linux now, whereas SGI's XFS and IBM's JFS still have some work ahead of them before those file systems and their associated utilities are ported to Linux. Plus, they must be stabilized in order to be used in production environments.

This, in and of itself, makes ReiserFS very attractive right now. It can be used immediately, and people can see the benefits immediately. While waiting for something like XFS—which may end up being more robust and stable considering the company behind it and how long it has been used—might be an option for some, others don't see the need to wait. ReiserFS fills a need, and it does it very well.

The level of ReiserFS support varies from distribution to distribution. For instance, Linux-Mandrake 7.1 was the first version of Linux-Mandrake to include support for ReiserFS. You could format partitions with ReiserFS during installation and happily use ReiserFS partitions in everyday work. Previous versions, however, could not. With other distributions, this support depends, in large part, on which patches have been applied to the kernels they supply. SuSE, one of the primary sponsors of ReiserFS, undoubtedly makes it an option in its latest 7.0 release.

Trying to get an older version of Linux to work with ReiserFS, however, is probably not worth the hassle. Your best bet is to upgrade your favorite distribution to its latest version, and chances are it will support ReiserFS. If you can't do this, you can still get ReiserFS to work on your system, but you’ll need to upgrade your kernel to at least 2.2.11.

Grab the ReiserFS patches from one of the ReiserFS mirror sites. There’s a patch for 2.2.7 kernels, but nothing for 2.2.8 through 2.2.10. I'm not sure how well the 2.2.7 patch works, so I would suggest at a minimum using a 2.2.11 kernel; a 2.2.16 or the latest 2.2.17 kernel would probably be a better choice, however.
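Before downloading anything, it may help to confirm that the running kernel meets the 2.2.11 minimum mentioned above. The following is a minimal sketch; the version_ge helper is a hypothetical name introduced here, and it compares only the first three numeric fields of the version string.

```shell
#!/bin/sh
# version_ge A B: succeed if kernel version A is at least version B.
# Compares the first three dot-separated fields numerically; a suffix such
# as "-1mdk" on the last field is ignored by awk's numeric conversion.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

min=2.2.11
running=$(uname -r)
if version_ge "$running" "$min"; then
    echo "kernel $running meets the $min minimum"
else
    echo "kernel $running is older than $min; upgrade before patching"
fi
```

Note that this only checks the version number. It does not tell you whether a ReiserFS patch exists for that exact kernel, so you still need to match the patch file to your kernel source.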

Make sure you have the source code for your kernel in /usr/src/linux-2.2.17 (if you're using the 2.2.17 kernel), commonly reachable through a /usr/src/linux symbolic link. The ReiserFS patch in this case would be contained in the file linux-2.2.17-reiserfs-3.5.25-patch.gz. After downloading the file into the kernel source directory, apply the patch, as shown here:
cd /usr/src/linux
zcat linux-2.2.17-reiserfs-3.5.25-patch.gz | patch -p0

This will apply the ReiserFS patches to the 2.2.17 kernel and will create a new directory called /usr/src/linux/fs/reiserfs/utils, which contains the source to build various utilities to use with ReiserFS, including mkreiserfs and reiserfsck. I’ll explain how to build these tools in a moment.

The next step is to compile your kernel with the newly applied patches. It’s beyond the scope of this article to tell you which configuration options to select when compiling the kernel, except for one option, which deals with ReiserFS itself. When asked about ReiserFS support, answer y (for yes) or m (for modular) to compile the support for ReiserFS either directly into the kernel or as a module. You can configure your kernel by using any of the following commands:
make config
make xconfig
make menuconfig

The make xconfig command is probably the easiest method to use as it provides an X interface. The make config command is probably the most difficult method and should be used only by people who have compiled their own kernels many times before and know the general ins and outs of kernel compiling. I wouldn’t recommend it for a first-time kernel compiler.
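Whichever configuration tool you use, the answers end up in the .config file at the top of the kernel source tree, so you can check how the ReiserFS question was answered before starting a long compile. This is a sketch only: CONFIG_REISERFS_FS is assumed to be the option name the patch adds, and reiserfs_config is a helper name made up for this example.

```shell
#!/bin/sh
# reiserfs_config FILE: report how ReiserFS support is set in a kernel .config.
# CONFIG_REISERFS_FS is assumed to be the option name used by the patch.
reiserfs_config() {
    case $(grep '^CONFIG_REISERFS_FS=' "$1" 2>/dev/null) in
        *=y) echo built-in ;;
        *=m) echo module ;;
        *)   echo disabled ;;
    esac
}

config=/usr/src/linux/.config
echo "ReiserFS support in $config: $(reiserfs_config "$config")"
```

If this reports "disabled" after you have run one of the configuration commands, rerun the configuration and look for the ReiserFS question again rather than compiling a kernel that cannot mount your new partitions.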

Next, compile your kernel with the following commands:
cd /usr/src/linux
make depend
make clean
make bzImage

Once you’ve successfully accomplished these steps, you’ll need to compile the modules for your kernel. Issue the following commands, one after the other as the previous command completes:
cd /lib/modules
mv 2.2.17 2.2.17-old
cd /usr/src/linux
make modules
make modules_install
cd /lib/modules/2.2.17
/sbin/depmod -a

You may omit some of the above steps if you don’t currently have a 2.2.17 kernel installed on the system. If you’re upgrading from the 2.2.16 kernel to 2.2.17, you won’t need to rename your /lib/modules/2.2.17 directory to /lib/modules/2.2.17-old, as the directory probably won’t exist. The only time you’ll need to do this is when you have a current 2.2.17 kernel installed on the system because you will overwrite your old modules. Be sure to make a backup of them before you begin, though.
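The "rename only if it exists" rule can be wrapped in a small helper so the same commands work whether or not a 2.2.17 module directory is already present. This is a minimal sketch; backup_modules is an illustrative name, and the optional second argument exists only so the function can be tried on a scratch directory instead of /lib/modules.

```shell
#!/bin/sh
# backup_modules VERSION [BASE]: rename BASE/VERSION to BASE/VERSION-old,
# but only if it exists and no earlier backup would be clobbered.
# BASE defaults to /lib/modules; the function name is illustrative.
backup_modules() {
    version=$1
    base=${2:-/lib/modules}
    if [ ! -d "$base/$version" ]; then
        echo "no $base/$version to back up; nothing to do"
    elif [ -e "$base/$version-old" ]; then
        echo "$base/$version-old already exists; not overwriting" >&2
        return 1
    else
        mv "$base/$version" "$base/$version-old"
        echo "moved $base/$version to $base/$version-old"
    fi
}

# Typical use as root before "make modules_install":
#   backup_modules 2.2.17
```

Refusing to overwrite an existing 2.2.17-old directory is a deliberate choice here: that directory may hold the only working set of modules for your fallback kernel.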

At this point, you’ll have a new kernel image located in the /usr/src/linux/arch/i386/boot directory. Copy the file called bzImage to your boot directory, giving it an appropriate name (for example, bzImage-2.2.17 or something similar), like this:
cd /usr/src/linux/arch/i386/boot
cp bzImage /boot/bzImage-2.2.17

You may also want to generate an initrd image file, which will contain some modules required to boot the root file system. For instance, if you want to be able to boot off a ReiserFS root partition, you’ll need an initrd image. You can make it like this:
mkinitrd -f --ifneeded /boot/initrd-2.2.17 2.2.17

This command will create a file called /boot/initrd-2.2.17, which is the image containing the modules you may need to boot your partition (such as the ReiserFS module, a SCSI module if you use a SCSI hard drive, etc.). Next, edit your /etc/lilo.conf file, as shown here:
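A minimal sketch of the image stanzas such a file might contain follows, written as a small script that prints them so they can be reviewed before being pasted into /etc/lilo.conf. It is reconstructed from the file names used in this article, not taken from an original listing. The labels linux and linux-old and the old image name /boot/vmlinuz-2.2.17-1mdk are assumptions, and the global options at the top of your existing lilo.conf (boot device, timeout, and so on) are left untouched.

```shell
#!/bin/sh
# Print two lilo.conf image stanzas: the new kernel first, the old one kept
# as a fallback. Labels and the old image file name are assumptions.
lilo_stanzas() {
    cat <<'EOF'
image=/boot/bzImage-2.2.17
    label=linux
    root=/dev/hdc5
    initrd=/boot/initrd-2.2.17
    read-only

image=/boot/vmlinuz-2.2.17-1mdk
    label=linux-old
    root=/dev/hdc5
    read-only
EOF
}

lilo_stanzas
```

If your old kernel already has its own stanza, keep it exactly as it is (including any initrd line it uses) and only give it a different label, so that both kernels appear at the LILO prompt.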

This code assumes that your root partition is /dev/hdc5 and that your previous kernel was a Linux Mandrake kernel (2.2.17-1mdk). As with all kernel compiling, you should always keep your old kernel available until you can verify that the new kernel works as expected. Once you’ve edited your /etc/lilo.conf file, run this command to rebuild your boot sector:
/sbin/lilo -v

When you’ve successfully compiled your kernel, enter the /usr/src/linux/fs/reiserfs/utils directory and compile the ReiserFS utilities, like this:
cd /usr/src/linux/fs/reiserfs/utils
make dep
make install

After you’ve completed these steps, the ReiserFS utilities will be available on your system, and you’ll be ready to reboot your computer. Obviously, if any one of these steps fails, you’ll need to go back and try it again. Generally speaking, failures with kernel compilation usually come from improperly configuring your kernel prior to starting the compile procedure.

If you have the option, I highly recommend grabbing an RPM from your vendor for the latest stable kernel (2.2.17 or 2.2.16 if you prefer) with ReiserFS support built in and installing an RPM containing the ReiserFS utilities. This is more for people who don't have the time, or the expertise, to compile their own kernel. Compiling a kernel is not a trivial matter, but it is getting better. If you can upgrade your existing kernel to a precompiled RPM-based kernel with the ReiserFS support, so much the better.

Once you have rebooted the system, if all has gone well, you should be at a login prompt. If this is not the case and you encounter an error, go back and follow these steps again. Chances are you selected an inappropriate answer to a question during the kernel configuration. Rebuild your kernel and try it again. This is why we keep the old kernel on the system for a while. In the case of a kernel panic or any other startup problem, falling back to a known and reliable kernel is good practice.
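Once you are logged in, a quick way to see whether the new kernel actually knows about ReiserFS is to look for it in /proc/filesystems. The sketch below assumes the usual format of that file (an optional nodev flag followed by the type name); fs_supported is a helper name invented for this example.

```shell
#!/bin/sh
# fs_supported NAME [FILE]: succeed if NAME is listed in FILE, which is in
# /proc/filesystems format (an optional "nodev" flag, then the type name).
fs_supported() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}" 2>/dev/null
}

if fs_supported reiserfs; then
    echo "this kernel can mount ReiserFS"
else
    echo "reiserfs not listed; if built as a module, try: modprobe reiserfs"
fi
```

If you answered m during configuration, the type only appears in the list once the module has been loaded, so a missing entry does not necessarily mean the build went wrong.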

If all has gone well, however, select a spare partition (one that you have either backed up or that contains nothing of importance) and run the mkreiserfs utility on it by using the following command:
mkreiserfs -h tea /dev/hdc1

This will make a ReiserFS-formatted file system on the device /dev/hdc1 (first partition on the first drive on the second IDE channel). The -h option specifies the name of the hash to use (you can select between tea and rupasov). The rupasov hash (the default) is the faster of the two, especially for extremely large directories with sequentially named files, but the drawback is that it has a higher probability of hash collisions. A hash collision means that the system will suddenly refuse to create a file with a certain name even if there is enough free space on the disk. The tea hash is a cryptographic hash; it is slower but has a lower probability of a hash collision. Since reliability should be of primary importance, I suggest using the tea hash, even if you do suffer a small performance hit for using it. If you need to, you can also specify the block size after the device, but if you omit it, mkreiserfs will determine the best block size to use.

Next, create the mount point. In this case, we want to mount this new partition as /files, so we have to make the mount point:
mkdir /files

Then, mount the partition using:
mount -t reiserfs /dev/hdc1 /files

You can also add it into your /etc/fstab file, like this:
/dev/hdc1  /files  reiserfs  defaults  1 2

This will automatically mount your new ReiserFS partition as /files every time you boot.
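A typo in /etc/fstab tends to show up only at the next boot, so it can be worth checking the entry right after editing. The following is a small sketch; fstab_type is a hypothetical helper that simply reads the second and third columns of the file.

```shell
#!/bin/sh
# fstab_type MOUNTPOINT [FILE]: print the file system type that FILE
# (default /etc/fstab) assigns to MOUNTPOINT, ignoring comment lines.
fstab_type() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $3 }' "${2:-/etc/fstab}"
}

if [ "$(fstab_type /files 2>/dev/null)" = reiserfs ]; then
    echo "/files will be mounted as reiserfs at boot"
else
    echo "/files is not listed as reiserfs in /etc/fstab"
fi
```

For a stronger check, unmount /files and run mount /files as root; with only the mount point given, mount takes the device and type from /etc/fstab, so a successful mount confirms the entry works.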

Because the file system is such an important part of your system, you must take care when installing or updating anything related to it. Any mistakes may render your system inoperable or damaged. This is another reason why I encourage you to download or purchase the latest version of your favorite distribution, which most likely will have ReiserFS support already built in. I don't know how much luck you’ll have trying to build a new kernel with ReiserFS support for Red Hat 5.1 or a similar older distribution.

Nonetheless, the advantages of having a journaling file system like ReiserFS make the effort well worthwhile. With lightning-fast boot-up times and more reliable error recovery, ReiserFS is definitely a good choice. The time you would have spent doing an fsck on a 30-GB ext2 partition compared to the time ReiserFS spends to check a partition of the same size will astound you. Quite frankly, once you experience a journaling file system like ReiserFS, you'll never want to go back to something as slow as ext2.