Linux 101: Efficient software management with the Advanced Package Tool in Debian

One of the characteristic features of any Linux distribution is its official software management system. Just as RPM to some degree defined Red Hat Linux throughout much of its history, so too has the DEB package format helped to define the Debian GNU/Linux distribution. Explore the benefits of the Advanced Package Tool (APT) package manager for the Debian Linux distribution.

The basic unit of currency in the United States economy is the dollar. There are smaller denominations of money, of course, but nobody really buys anything with a quarter any longer. You can't buy much with a dollar either -- you typically need many of them to make a major purchase. Similarly, the basic unit of software for many operating system environments is the package. There are things smaller than most packages, and several of them might be bundled together in a single package, such as the collection of system utilities collected in the appropriately named Debian sysutils package. There are also a great many packages for which other packages are dependencies: you cannot install just package A, because it needs packages B, C, and D to work.

In the mid-nineties, a condition known colloquially as "dependency hell" was not uncommon with certain operating systems. Two specific examples that developed their own derivative nicknames were dependencies for packages using the Red Hat Package Manager system and Microsoft's Dynamic Link Library system: RPM Hell and DLL Hell. As these colorful -- and potentially offensive -- names suggest, many people found the situation of software dependency issues in the increasingly complex world of operating systems frustrating at the very least.

Package managers were developed for Linux distributions to solve problems such as dependency issues and the difficulty of software installation without a unified package management system. Where the developers of each individual piece of third-party software in proprietary systems such as Microsoft Windows and Apple Mac OS choose and implement the installation method of their own software independently, based on whatever information about software integration with the platform is available, developers of open source software for Linux-based operating systems generally need only write the software itself. At that point package maintainers for various Linux distributions take over and create binary software packages from the source code for those applications. These are then bundled with the distribution, located in online archives, or made available for individual download, and software management applications called package managers are used to install them.

One of the characteristic features of any Linux distribution is its official software management system. Just as RPM, released with Red Hat Linux 2.0 in 1995, to some degree defined Red Hat Linux throughout much of its history, so too has the DEB package format helped to define the Debian GNU/Linux distribution.

A brief history of Debian software management

The Debian project, named after its founder Ian Murdock and his wife (at the time his girlfriend) Debra, was announced in August 1993. It is the second-oldest Linux distribution project that is still active, second only to the venerable Slackware. The first appearance of a binary package system in an official Debian release, however, can be identified at Debian 0.91 in January 1994, about five months after the first release version saw the light of day. At the time, it was a rudimentary hack in its early stages of development, and arose within circumstances where the few Debian developers initially envisioned all software management being handled manually through the use of source code files.

Debian 0.93r5 marked the release of the dpkg package manager, which would be the underlying package management utility of the Debian distribution from then on. It arrived a little less than half a year before Red Hat Linux 2.0, which probably makes Debian's the oldest modern Linux package management system in existence. Debian 0.93r6 added dselect, intended to be a user-friendly front end to dpkg, though many did not find it to be so friendly.

The original binary package format was specific to the tools provided with Debian, but shortly thereafter the Debian package format was reinvented using the ar file archive format so that its contents could be opened and examined using only standard, trusted tools on any standard Unix system, such as the ar and tar file archiving utilities.
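
As a rough illustration of that design, the following shell sketch inspects a package file with nothing but ar and tar. The function names are hypothetical, the path passed in stands for any .deb file you have on disk, and the last function assumes a gzip-compressed data.tar.gz member, which may not hold for every package.

```shell
# Hypothetical helpers for examining a .deb file with standard archive tools.

# List the members of the ar container. A Debian package normally holds a
# debian-binary file plus a control tarball and a data tarball.
deb_members() {
    ar t "$1"
}

# Print the package format version stored in the debian-binary member.
deb_format_version() {
    ar p "$1" debian-binary
}

# List the files the package would install.
# Assumption: the data member is named data.tar.gz (gzip compression).
deb_file_list() {
    ar p "$1" data.tar.gz | tar -tzf -
}
```

Calling deb_members with the path to a package file prints one member name per line, without installing anything or needing any Debian-specific tool.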

From the very beginning, the deb package format has been designed to support dependency and conflict tracking, and the Debian project has enforced strict rules for package maintenance, to ensure that dependency issues would be resolved and minimized before the end user ever laid hands on the software. It is in large part for this reason that Debian GNU/Linux has earned a well-deserved reputation as one of the most stable Linux distributions.

It also has a reputation for comprehensive, and easy to use, software management. This reputation has developed with the creation of the Advanced Package Tool, or the apt command line tools, released with Debian 2.1 "Slink" in March 1999. APT is famously a defining characteristic of Debian GNU/Linux, and of derived distributions such as Knoppix and Ubuntu.

The organization of the Debian distribution

The year 2000, during the development of Debian 3.0 "Woody", saw the beginnings of reorganizing the central repositories of software packages maintained by the Debian project. Ultimately, this reorganizing would result in the current Stable, Testing, Unstable, and Experimental distributions that are familiar to many Debian users.

The Stable branch is the eventual goal of the Debian project's rigorous testing process; only when a Debian distribution configuration is stable enough to be moved to Stable does it get an official release number assigned to it. Even so, many Debian users make daily use of the Testing, and even Unstable, distribution branches without negative effects. Some might even say that Debian Stable has specific uses to which it is best suited, and that Testing or Unstable are better for other purposes, even in production environment deployments such as the corporate enterprise. Though its name is perhaps a bit worrying, the Unstable branch of Debian development is as reliable and usable as many stable releases of other Linux distributions.

The main branches of the Debian GNU/Linux distribution are defined by the amount of testing the packages included with each branch have undergone:

  • There are several Experimental distributions of Debian GNU/Linux at any given time, and once a newly included software version has been packaged and tested sufficiently to ensure it works without obvious and prohibitive defects, it moves into Unstable.
  • In Unstable, packages are tested in concert with the rest of the Unstable distribution for compatibility and reliability. After ten days, or once all issues are resolved, if any arise during that time, packages move into Testing.
  • The Testing distribution is the beta testing environment for software that is intended to eventually move into the Stable distribution. Over time, new software is added and extensively tested for stability, reliability, and interoperability. Eventually, the addition of new software and new software versions is halted, and the focus moves to resolving final issues. Once the current Testing distribution is fully tested to the satisfaction of the core Debian team, and not a moment sooner, with no need to answer to a vendor release cycle that might otherwise force premature release, the now-frozen Testing distribution is moved into Stable release. Most home desktop Debian users find Testing to suit their needs admirably.
  • Debian Stable releases are given a release version number. Official Debian project policy is for no new software version updates to occur -- only bug and security updates for Debian Stable packages are added. The Stable release is the most commonly used Debian distribution branch for production server environments where rock-solid stability and dependability are of paramount importance.

Using the APT system for daily maintenance

There are three utilities that will cover everything most Debian users should use for command line software management -- apt-cache, apt-get, and debfoster.

  1. apt-cache is the utility used for searching for information about available packages. It is called "apt-cache" because it works with a locally stored cache of the contents of Debian's software archives. The apt-cache utility can be run by normal user accounts under standard system configurations. There are several commands particular to apt-cache that are of frequent use to Debian sysadmins:
    • apt-cache search string, where "string" is a search term, will show you a list of package names in the cache that include that term, whether in the name or description of the package. The search term is not case sensitive, so you do not need to be particular about capitalization when using the command.
      More than one search term can be used, in which case the command will show a list of packages that contain all the specified search terms. If you want to be sure that a specific term appears in the name or brief one-line description of a package whose name is displayed by apt-cache search, you may combine the command with grep by entering apt-cache search string1 | grep string2. In this case, string1 acts as described above, but only those results that match string2 in the brief output of apt-cache search will be displayed, and string2 is case sensitive unlike string1. This behavior can be modified by using command line options for grep, as described in its manpage.
    • apt-cache show package, where "package" is the name of a specific package, will show a more complete description of a given software package than apt-cache search. It is common to search for a package that performs a given task using apt-cache search, then to read more about it using apt-cache show. This command will provide information about the size of the package, what software it installs, what other software will be automatically installed if you use APT to install it, and other, more esoteric information.
    • apt-cache -h provides a short help message with information on usage of the apt-cache utility.
  2. apt-get is used for communicating with software archives and for performing installation and uninstallation actions. It is called "apt-get" because it typically needs to "get" packages from the archives to perform as designed. Because it grants the ability to make system-wide software changes, the apt-get utility can only be run by the root account under standard system configurations. Like apt-cache, this utility provides several commands that are often useful to Debian sysadmins:
    • apt-get install package is used to install programs from the Debian software archives, where "package" is the name of the software package that is being installed. The APT system automatically resolves dependencies and will take care of installing other packages that are needed to complete an installation. If any other packages need to be removed or added to resolve dependencies, apt-get will give you a list of such changes that need to be made and ask you if you wish to proceed. More than one package at a time may be installed in this manner, by simply listing more package names, separated by spaces, as in apt-get install package1 package2 package3. You can simulate an install operation with the -s command line option, if you wish to see what would happen without doing an actual install. To do so, enter a command such as apt-get -s install package.
    • apt-get remove package does precisely the same thing as apt-get install, but in reverse. Instead of installing, it uninstalls; instead of adding dependencies for the package, it removes packages that depend upon it. By using the --purge command line option, as in apt-get --purge remove package, it will even eliminate configuration files specific to the package being removed.
    • apt-get update performs an update of the cache used for apt-cache operations so that the most recent state of the software archives is used. Separately, install and upgrade operations can build up some unnecessary clutter on the hard drive over time, as downloaded package files that are no longer available from the archives are left behind, though not enough that most sysadmins would ever notice. The apt-get autoclean command provides a fast and easy way to clear out that clutter, however, and it is a good idea to use it regularly if you want to waste as little hard drive capacity as possible.
    • apt-get upgrade does a system-wide check against the package archive to determine what software in the archive has moved to a newer release version or had security patches or bug fixes applied. It then provides a list of what changes can be made and asks whether you wish to proceed with the upgrade. It is a good idea to do this regularly, even daily, to ensure that no security patches are missed -- but if you haven't done an apt-get update recently, your local cache will not show anything new, so apt-get upgrade is best used immediately after apt-get update to ensure you have the most recent software archive data in the local cache.
    • apt-get -h gives the user a short help message, as the -h option does for apt-cache.
  3. debfoster is a front end for both apt-get and apt-cache that unifies their separate capabilities for a very specific set of operations. It maintains a database of installed software that is managed by APT on the system, and allows for easy viewing and management of installed software.
    Though it is a very useful tool, it is not part of the core APT set of utilities, so it must in many cases be installed separately on your system by use of the apt-get install command. When debfoster is entered at the command line, it goes through major software packages managed by APT and asks what you want to do with them. In addition, a listing of other installed packages on which each one depends will also be shown. In each case, you can give one of several responses, some of the more common of which are described here:
    • n removes the package as though the apt-get --purge remove command had been issued for that package.
    • q is the fail-safe option, "quit". If you have accidentally given an unintended response to a debfoster prompt, you can use q to exit the debfoster program without finalizing any of the decisions you've made.
    • s causes debfoster to skip the current package. This neither uninstalls the software nor causes it to be removed from the debfoster list of packages that are in question. This can be useful as a reminder if you think you might later want to remove the package, or want to research it in more depth before giving a final answer.
    • y will cause the package to be kept, along with all its dependencies. Until something changes to alter the state of the package on the system, you will not be asked again about this package, even if you run debfoster again. It is possible to start debfoster so that it shows all packages it manages -- even those you have previously told it to keep with the y command -- by using the -n command line option, as in debfoster -n. You can also choose to keep a package by simply pressing the Enter key, because y is the default answer.
    • ?, the question mark, produces the same output as apt-cache show without kicking you out of the debfoster program. This allows for at least some limited research on a given package while using debfoster in case you are not certain whether you want to remove a given package from the system.
    • h shows help information to give guidance in usage of the debfoster program.
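
The apt-cache and apt-get commands described in the list above can be combined into a small routine. The following is a sketch rather than a definitive script: the run helper, the DRY_RUN variable, and the function names are hypothetical, and it defaults to printing commands instead of executing them so that it can be tried without root access.

```shell
# Hypothetical wrappers around the apt-cache and apt-get commands.
# DRY_RUN is an assumed convention of this sketch, not an APT feature.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = really run them

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Refresh the local cache, apply pending upgrades, then clear stale files.
daily_maintenance() {
    run apt-get update &&
    run apt-get upgrade &&
    run apt-get autoclean
}

# Search the cache for the first term, then narrow the one-line results
# with a second term (grep is case sensitive unless given -i).
find_package() {
    apt-cache search "$1" | grep "$2"
}

# Ask apt-get to simulate an install so the planned changes can be reviewed.
preview_install() {
    run apt-get -s install "$@"
}
```

With the default setting, daily_maintenance only prints the three commands it would issue. Setting DRY_RUN to 0 and running the function as root performs the real update, upgrade, and cleanup in that order, stopping at the first step that fails.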

Other software management utilities and applications

The apt-cache, apt-get, and debfoster utilities all provide capabilities beyond those described above, and you can read more about them in the manpages for each of those programs. There are other utilities available for more advanced uses of the APT system as well, including apt-file, apt-key, deborphan, and dpkg.

The aptitude utility provides an alternate, somewhat unified, command line interface to APT, and has some different default behaviors. A console-based captive interface for aptitude can also be invoked to browse through packages and choose some for installation. A graphical user interface is provided by the Synaptic package, for users who wish to be able to search and browse packages with a mouse rather than a keyboard.

More, or fewer, or simply different, software archives can be specified for the APT system to use by way of the /etc/apt/sources.list configuration file. While Debian systems do not include the /etc/apt/apt.conf file by default, one can be created to set options such as the default release to install from, and the related /etc/apt/preferences file can indicate specific software versions that should be installed on the system, so that apt-get upgrade operations will not undo something you have carefully decided must be done in a specific manner on your system.
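
A minimal sketch of what such configuration might look like follows. The mirror URL, the choice of branches, and the function names are assumptions for illustration, not taken from any particular system. The functions only print sample file contents, so the result can be reviewed before anything is copied into place.

```shell
# Print sample contents for /etc/apt/sources.list.
# Assumption: deb.debian.org as the mirror, with Stable and Testing enabled.
example_sources_list() {
    cat <<'EOF'
deb http://deb.debian.org/debian stable main
deb http://deb.debian.org/debian testing main
EOF
}

# Print sample contents for /etc/apt/apt.conf.
# APT::Default-Release tells APT which release to prefer when several
# archives offer the same package.
example_apt_conf() {
    cat <<'EOF'
APT::Default-Release "stable";
EOF
}
```

Redirecting the output to a scratch file, as in example_sources_list > sources.list.new, gives something to compare against the existing configuration. After the real files are changed, apt-get update must be run before the new archives are visible to apt-cache and apt-get.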

The Debian APT system is a powerful, flexible, highly configurable software management system that is quite easy to use. Some would say it is easier to use than any other software management system ever created. It even goes so far as to encompass management of software installed from source code, as long as the APT system is used to initially install it. Learning to use the APT system is a key skill for efficient and effective administration of Debian GNU/Linux systems, and the capabilities the Advanced Package Tool affords its sysadmins are much of the reason for Debian's popularity in the open source community.

About

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.

1 comments
pfarrell

Nice discussion on the history of Debian. The apt system is the defining characteristic of all Debian distributions and is the best package management system period. It is the only system that is better than the Microsoft method of application management. There, I said it. RPM sucks. Yum sucks. Emerge is cool, but Gentoo takes too much time for a production rebuild unless you're a real bad ass. PF http://patf.net/blogs/
