
Making RPMs, part 1: The .spec file header

In the first of five Daily Drill Downs on creating RPMs, Vincent Danen explains why you might want to build RPM packages for yourself and others and covers the .spec file.


Most people who use Linux distributions such as Red Hat, Linux Mandrake, SuSE, and Caldera are familiar with RPM (Red Hat Package Manager) because those distributions rely on the RPM packaging format to distribute software. RPM is a system for distributing both binary and source packages. Traditionally, Linux users had to download the source code for programs as TAR archives and compile and install the programs themselves. That method is not obsolete today, but most Linux users, new or veteran, no longer prefer it.

A bit of explanation
RPM revolutionized the distribution of software for Linux to some degree. Users are no longer required to compile all their own software. A large number of programs come precompiled in RPM packages and installation is as simple as with any Windows program, if not simpler. RPM completely manages the various software packages on your system. It remembers what it installed and where it installed various files or directories. It cleans up everything efficiently when uninstalling, without leaving unnecessary files or libraries to clutter up your system. In short, RPM takes most of the brainwork out of maintaining and upgrading the software installed on your Linux system.

In this series, we are going to take a look at how to build your own RPM packages. Although many programs are available via RPM, some are not, and you may want to build your own packages to keep a clean system. Or perhaps you want to contribute your favorite non-RPM program to the distribution you use. Many distributions allow and encourage users to submit source RPM packages. These are then compiled and made available to other users of the distribution.

We're going to learn a number of methods of creating clean, efficient RPM packages for whatever distribution you choose. I have made over four dozen RPM packages for the Linux Mandrake distribution, and that is the distribution I will be using to illustrate RPM package creation. Distributions differ, and so does making RPM packages for various distributions. File placement, directory placement, available tools, and so on are largely dependent on the distribution you use. But the basics of RPM creation are the same, so even if you don't use Linux Mandrake, these tutorials will help you create RPM packages for any distribution, provided it supports RPM.

Installing RPM and associated tools
If you use an RPM-based distribution, you already have RPM installed. RPM is necessary for the management, installation, removal, and creation of RPM packages. Other tools you will want to install, for the Linux Mandrake distribution, are rpmtools, rpmlint, and spec-helper. These tools are new to the forthcoming Linux Mandrake 7.1 distribution and may work for any other distribution. They will work flawlessly under Red Hat, but you might have to rebuild them from source for other distributions.

rpmtools is a package that provides various tools to help you build solid RPM packages. It will work transparently with RPM when building new packages, so there is nothing for you to learn to make use of it. rpmlint is a program written in Python to check binary RPM packages for errors. Although rpmlint is tailored specifically to Linux Mandrake, it can be adapted to work with any other distribution by editing the rpmlint Python script itself. Finally, spec-helper is another set of tools that works transparently with RPM when building packages and will automatically compress man pages and info pages, clean files, strip files, and so forth.

These tools will only work with the RPM version that Linux Mandrake comes with, so if you want to take advantage of these excellent tools, you will need to upgrade the version of RPM on your system to the RPM version that Linux Mandrake provides. Don't worry; it won't break your existing RPM system. The RPM version that Linux Mandrake provides does everything that other RPM versions do, but is enhanced to take advantage of these various tools.

You can download these tools and the Linux Mandrake version of RPM from any Linux Mandrake mirror site or from rpmfind.

RPM directory tree
RPM uses a directory tree for building packages. This tree is located in the /usr/src directory. For Linux Mandrake it's in /usr/src/RPM, while in Red Hat it's in /usr/src/redhat. The directories below this root RPM directory all have a specific function.
  • The /usr/src/RPM/BUILD/ directory is where RPM uncompresses source packages and compiles the binaries.
  • The /usr/src/RPM/RPMS/ directory contains subdirectories with various build architectures, like i386, i586, i686, noarch, and so forth. When you build an i386 RPM, it will be placed into the /usr/src/RPM/RPMS/i386 directory.
  • The /usr/src/RPM/SOURCES directory contains all the source files and patches for your RPMs. Anything you want to include in your source RPM archives must be placed in this directory.
  • The /usr/src/RPM/SPECS directory contains the .spec files for your RPMs. The .spec file is the configuration file for building, installing, and uninstalling your RPM packages.
  • The /usr/src/RPM/SRPMS directory contains all the source RPM packages that are built. The source RPM package contains the source files and .spec file which are used to rebuild RPM packages.

Defining the RPM header
As you now know, the .spec file is the configuration file for all the RPM operations concerning a single RPM package or package family. With a single .spec file, you can create multiple RPM packages. (For example, you can create a program RPM and a program-devel RPM, where the first package contains the actual program, and the second package contains any programming-related files that might be required by other programs needing those libraries or header files.) The .spec file is the meat of the RPM package you are going to create. It handles patching source code, building source code, installing binaries and libraries, packaging and compressing files, and so forth. Without the .spec file, there would be no RPM.
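
As a rough sketch of the program and program-devel split described above, a single .spec file can declare a second package with the %package directive. The Summary text, the description, and the idea that the program ships header files are assumptions made for illustration; %{name} and %{version} stand for the main package's name and version, which are defined as shown later in this article.

```spec
# Hypothetical fragment, placed after the main header of a .spec file.
# The directive below declares a second package named <main name>-devel.
%package devel
Summary:    Header files and libraries for building against %{name}
Group:      Development/Other
Requires:   %{name} = %{version}

%description devel
Install this package if you want to compile programs that use %{name}.
```

Both packages are then produced by the same build, and each gets its own header information.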

Enter the /usr/src/RPM/SPECS directory. This is where we are going to create the .spec file. For this tutorial, we are going to build a simple RPM for a Perl script called rpmproc (which, incidentally, is an RPM management script that works very nicely and takes a lot of the brainwork out of the actual building of RPMs). Download the source code for rpmproc from Freezer Burn and save the rpmproc-1.3.tar.bz2 file into your /usr/src/RPM/SOURCES directory. Now, create a new file called rpmproc.spec and edit it with your favorite text editor.

The first thing we will do is make a few define statements. These define statements create variables that can be used throughout the .spec file. For this RPM, we'll use the following:
%define name    rpmproc
%define version    1.3
%define release    1mdk


The first define statement creates the name variable, with a value of rpmproc, which is the name of the program and RPM we are creating. The second define statement creates the version variable with a value of 1.3, which is the current version of the program. The final define statement creates the release variable, which we define as 1mdk. This means it is the first RPM release, and the added mdk means it is a Linux Mandrake RPM. All other distributions simply use the release number without any appended characters. With these three pieces of information we have created the base of our RPM's file name, rpmproc-1.3-1mdk. RPM appends the build architecture and the .rpm extension when it writes the package file.
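
To make the naming concrete, here is a small sketch of what changes when you repackage the same program. The 2mdk value is an assumed example of a second packaging of the same 1.3 source, not something taken from the rpmproc package itself.

```spec
%define name       rpmproc
%define version    1.3
# Second build of the same upstream version: only the release changes.
%define release    2mdk

# name-version-release now reads rpmproc-1.3-2mdk
```

Bumping only the release lets RPM treat the rebuilt package as newer than rpmproc-1.3-1mdk even though the program version is unchanged.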

The next section deals with the informational part of the .spec file, which becomes the RPM header. This is where we define URLs for programs, authors, distributions, source and patch files, and so forth. For this program, we would enter the header information as follows:
Name:           %{name}
Summary:        Perl script to help manage and build RPM packages
Version:        %{version}
Release:        %{release}
Copyright:      GPL
Group:          Development/Other
URL:            http://www.freezer-burn.org/rpmproc.php3
Source:         %{name}-%{version}.tar.bz2
Requires:       perl-Mail-Sendmail, rpmlint, rpm
BuildArch:      noarch
Buildroot:      %{_tmppath}/%{name}-buildroot
Packager:       Vincent Danen <vdanen@linux-mandrake.com>


Let's take a look at the various fields for the RPM header. The first field, Name, gives us the name of the RPM. We just reference the name variable for this, since we previously defined it, and we write it as %{name}, with the variable name enclosed in curly brackets. (The brackets make it unambiguous where the variable name ends, so this series references all defined variables this way. Case is important.)

The second field, Summary, gives a brief description of the program. The Version and Release fields are also referencing variables we previously defined. The Copyright field indicates what sort of license this program has. In this case, rpmproc is released under the GPL.

The next field, Group, is very much distribution dependent. Under Linux Mandrake, this program fits best in the Development/Other group, but under Red Hat it might fall into a different group. The best way to determine what groups exist and which is appropriate for the program you are packaging is to fire up a GUI RPM client such as kpackage or GnoRPM, either of which displays all the RPM groups in a tree. These tools are typically used to install and uninstall RPM packages, and they sort the packages by group, so any group you see listed there is fair game and can be used. Pick the group that is best suited for the application. To specify a subgroup, place a slash between the parent and child groups. (In the above example, rpmproc belongs in the Other group, which is a subgroup of the Development group.)

The URL field is where you place the home Web site for the program (if one exists). If there is no home Web site, you can leave out this keyword.

The Source field contains the source file to build. This file can be a compressed file or any other type of file. The source file in this case is rpmproc-1.3.tar.bz2, which is written as %{name}-%{version}.tar.bz2. The variables are automatically translated, as they are throughout the .spec file. For some programs you may have multiple source packages to include, and you can do this by referencing the first source file as Source0, the second as Source1, and so on. Each Source[x] field should contain the full name of the source file. You can also include a link to the original source file. For example, we could have written the Source field as:
ftp://ftp.freezer-burn.org/pub/custom/%{name}/%{name}-%{version}.tar.bz2
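
A minimal sketch of the multiple-source form described above. The second file, rpmproc.conf.sample, is a made-up name used only to show the numbering; the URL is the one given in the text.

```spec
# First source: the program tarball, given with its download location.
Source0:    ftp://ftp.freezer-burn.org/pub/custom/%{name}/%{name}-%{version}.tar.bz2
# Second source: a hypothetical extra file kept in the SOURCES directory.
Source1:    rpmproc.conf.sample
```

Even when a Source field is written as a URL, the file itself still has to be present in the SOURCES directory under the name given by the last part of the URL.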

The Requires field lists the RPM packages that must be installed on the destination system before this package can be installed. This is where RPM finds its dependency information. In this example, rpmproc requires rpmlint, rpm, and perl-Mail-Sendmail, so before RPM will install this RPM, those three packages must be installed on the destination system. If any of them is missing, you will either have to force the installation with no dependency checking or download and install the missing packages prior to installing the RPM we are building.
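
The dependency list can also be written one package per line, and a dependency may carry a version comparison. The sketch below shows both forms; the 0.77 version number is an assumption chosen for illustration, not a documented requirement of rpmproc.

```spec
# Equivalent to the single comma-separated Requires line.
Requires:   rpmlint
Requires:   rpm
# Hypothetical minimum version, for illustration only.
Requires:   perl-Mail-Sendmail >= 0.77
```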

The BuildArch field indicates the build architectures for this package. In part five of this series, we will see how to target various build architectures (like i386 for Intel/386, i586 for Intel/Pentium, and so forth). With this field in place, regardless of what target is specified to build for, only the build architectures specified here will be created. For example, if you attempt to build rpmproc for i386, it will automatically build for noarch, which means it is architecture-independent. You would use this field for programs written solely in Perl, or BASH scripts, etc., and specify the type as noarch.

The Buildroot field indicates where our temporary install directory is to be located. You will reference the Buildroot many times in your .spec file as $RPM_BUILD_ROOT, which is a variable containing the information in this field. This is where you will install binaries and other files associated with your package so that RPM can bundle them all up for packaging. Typically, you want this to be a uniquely named subdirectory of a temporary directory. In this example, we point to %{_tmppath}/%{name}-buildroot. The %{_tmppath} variable is taken from the building user's ~/.rpmmacros file or the system-wide RPM defaults file. Usually this would be something like the /var/tmp directory. In this case we are telling RPM to use /var/tmp/rpmproc-buildroot as the temporary install directory. You want to tell the install program for your package to use $RPM_BUILD_ROOT as the destination root directory, so when you want the program to be in /usr/bin, you actually install it to $RPM_BUILD_ROOT/usr/bin or, in this case, /var/tmp/rpmproc-buildroot/usr/bin. You'll see more about this later in the series when we actually begin to build the RPM package.
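
As a preview of how the Buildroot is used, the following is a rough sketch of an install section. It assumes the unpacked source contains an executable script named rpmproc in its top-level directory, which this article does not confirm; the real section is covered later in the series.

```spec
%install
# Start from an empty temporary install directory.
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/bin
# Install into the build root, not into the live /usr/bin.
install -m 755 rpmproc $RPM_BUILD_ROOT/usr/bin/rpmproc
```

Because every path is prefixed with $RPM_BUILD_ROOT, building the package never touches the files already installed on the build machine.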

Finally, the Packager field should contain your name and e-mail address. This just means that you were the one to build the RPM and put it all together.

Other RPM header fields
There are more fields you can use in RPM headers, and you most likely will need to use them for some RPM packages you create. They all use the same format as those above, which is [field-name]: [field data].

The Distribution field is used to describe the distribution this RPM is meant for, whether it’s a Linux distribution or the name of your Web site (if you build more generic RPMs). Anything is valid, from Linux Mandrake to Red Hat to Joe's Linux RPMs, if you wish.

The Icon field is used to point to an icon representing the RPM package. Some GUI front ends for RPM may display the icon when you select a particular RPM package. The icon should be placed in your source directory and should be either a GIF or XPM file, with a transparent background. There is no need to place a path before the filename if it is placed in the source directory. This icon will also be stored in the source RPM you generate, along with any source files and patches.

The Vendor field is used to designate an organization that created the RPM. This would typically be something like MandrakeSoft or Red Hat or Joe's Linux RPM Web site.
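
Put together, the three fields just described might look like the following sketch. The Distribution and Vendor values are the examples named in the text; the icon file name is hypothetical.

```spec
Distribution:   Linux Mandrake
Vendor:         MandrakeSoft
# Hypothetical icon file, stored in the SOURCES directory.
Icon:           rpmproc.xpm
```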

The Provides field is used to indicate a virtual package name to resolve dependencies. Let's take Qmail, for example. If you were to make an RPM for Qmail, you would include:
Provides:     sendmail

This tells RPM that the Qmail package also provides the sendmail package virtually. This is typically used to resolve dependency issues. Some programs may have a Requires: sendmail field, and to install the programs, the user must have sendmail installed. If the user uses Qmail or Postfix or any other SMTP mailer that provides the same functionality as sendmail, the packager for Qmail or Postfix should include an appropriate Provides field to prevent dependency problems.
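
The two sides of that relationship can be sketched as follows. The lines belong to two different .spec files, and the program that sends mail is hypothetical.

```spec
# In the Qmail .spec file: advertise the virtual sendmail package.
Provides:   sendmail

# In the .spec file of a hypothetical program that sends mail:
Requires:   sendmail
```

With both in place, the dependency is satisfied by whichever installed package provides sendmail.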

The Conflicts field is used to indicate a conflict with another RPM package. Using the SMTP mailer example, the author of the Qmail RPM may want to include something like:
Conflicts:    sendmail, postfix

which tells RPM not to install the Qmail RPM if either the sendmail or the Postfix RPM is installed. RPM will tell the user that a conflict exists and that they must resolve the conflict, usually by uninstalling the conflicting RPM.

The Prefix field determines the directory prefix for installing RPMs. You may wonder why this field is even required if you are installing to a specific directory. In some cases, the location of binaries is user-dependent, meaning that you and I may choose different locations for something; I might install something to /usr/local/bin, while you may choose to install it to /usr/bin. The Prefix field allows for this selective installing because RPM lets the user install to a different prefix if they choose. Using this field means you are creating a relocatable package.
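
A minimal sketch of a relocatable package follows. It assumes, for illustration only, that the package installs a single file, /usr/bin/rpmproc.

```spec
# Default installation prefix; the user may override it at install time.
Prefix:     /usr

%files
# Every packaged file must live under the prefix for relocation to work.
/usr/bin/rpmproc
```

A user who prefers /usr/local could then install the package with a command such as rpm -ivh --prefix /usr/local followed by the package file name, and the script would end up in /usr/local/bin.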

The Patch field is used to specify patch files that will be applied to the unpacked source code before it is built. This allows you to use pristine source files, regardless of whether they need to be changed (to increase functionality of a program or to fix a bug). RPM applies the patch files you list here for you as part of the build, so the original archive is never modified by hand. You can include as many Patch fields as you like by using the same syntax as in the Source field. The first patch file would be Patch0, the second Patch1, and so forth.
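
The sketch below shows two numbered patches and the preparation section that applies them. Both patch file names are invented for illustration, and the %patch0 spelling is the traditional form; check the documentation for your RPM version, since the exact syntax may differ.

```spec
# Hypothetical patch files, stored in the SOURCES directory.
Patch0:     rpmproc-1.3-paths.patch
Patch1:     rpmproc-1.3-typo.patch

%prep
# Unpack the pristine source, then apply each patch in order.
%setup -q
%patch0 -p1
%patch1 -p1
```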

Conclusion
In this Daily Drill Down, we have looked at the first part of building your own .spec file. The header section of any RPM is important and provides a lot of vital data on the program you are packaging. It gives some flexibility and is useful for providing variables that can be used later in the various install, build, and cleanup sections of the .spec file.

In future Daily Drill Downs in this series, I will explore the other sections of the RPM .spec file in detail and provide working examples, tips, and other useful tidbits of information to help you generate quality RPM packages for your distribution of choice.

Vincent Danen, a native Canadian in Edmonton, Alberta, is an avid Linux "revolutionary" and a firm believer in the Open Source philosophy. He attempts to contribute to the Linux cause in as many ways as possible, from his Freezer Burn Web site to local advocacy in his hometown. Owner of a Linux consulting firm, Vincent is also the security updates manager for MandrakeSoft, creators of the Linux-Mandrake operating system. Vincent is a certified Linux Administrator by Tekmetrics.com.

The authors and editors have taken care in preparation of the content contained herein but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for any damages. Always have a verified backup before making any changes.

About Vincent Danen

Vincent Danen works on the Red Hat Security Response Team and lives in Canada. He has been writing about and developing on Linux for over 10 years and is a veteran Mac user.
