GNU is not Unix: How the GNU Project has gone astray

The GNU Project is a famous attempt to supplant commercial Unix systems, but it is not Unix.

In 1969, a small group of developers at Bell Labs created a new operating system whose descendants have become one of the most widely used OS families in the world. That operating system was Unix, which ultimately became the technological basis for the Internet. Because of a consent decree in an antitrust case, AT&T was prohibited by law from engaging in the computer business, so in the 1970s it basically gave Unix to academic institutions for free.

The University of California at Berkeley became one of the biggest centers of Unix development, along with Bell Labs itself. Thanks to the heavy research and development occurring at Berkeley, Unix effectively split into two separate operating system families. One was the Berkeley Software Distribution, or BSD Unix -- and the other eventually evolved into what AT&T called System V, or SysV Unix, for short.

Somewhere along the way, the UNIX trademark (all capital letters, somewhat distinct from the term Unix) became the mark of a commercial certification of an OS as conforming to a standard. While only certified OSes may use the UNIX trademark, which means that most BSD Unix systems are not technically UNIX systems, the major BSD Unix OSes most certainly are Unix. Aside from official UNIX systems and other Unix systems, however, there are also other Unix-like OSes.

In the 1980s, people started developing clones of Unix. One of the earliest was MINIX, initially developed as an instructional tool, though MINIX 3 is a much more ambitious project. Eventually, the most popular clone appeared, under the name Linux.

Another one of the earliest Unix cloning projects was the GNU Project. Core utilities, compilers, and other accessories for a free Unix clone were created, but Linux-based systems leveraged those tools to produce a complete OS before the GNU Project ever got around to developing its own OS kernel.

The tools that the GNU Project has developed have become important components for a lot of operating systems over the years, however -- most notably Linux-based systems. As a result, they have become very widely used, and some of them have even been ported to Microsoft Windows OSes. GCC, the GNU Compiler Collection, is one of the most widely used compiler packages in the world. Even the major modern BSD Unix systems have used GCC heavily for a number of years, though they have been starting to move away from it recently.

The GNU Project is hailed by a lot of people as the genesis of the entire free Unix-like OS ecosystem, the reason we even have open source OSes like Linux-based systems. Although it is framed as giving credit to the GNU Project, this is a very Linux-centric view of the world. The truth is that other free Unix-like systems existed to varying degrees at the time, and fully open source Unix-like systems, notably including MINIX and the BSD Unix systems, would surely exist now regardless of whether the GNU Project itself ever existed.

There have been some criticisms of the GNU Project's toolset over the years. For instance, the project is regarded by some as very exclusive and unlikely to accept outside contributions. The source code of some of its applications, such as GNU Screen, is generally regarded as buggy and unmaintainable. The GNU Project's departures from expected behavior for Unix tools are a major sticking point for a lot of Unix users as well.

General cases of annoying variations from standard Unix tools include many instances of basic command line options being changed, renamed (usually to a longer or more arcane option syntax), or even eliminated, sometimes in favor of something generally less useful. Developers notorious for their dedication to clean, stable, secure software design, and better known still for their loathing of its opposite, including OpenBSD founder Theo de Raadt, have some unflattering things to say about GCC. While we are at it, let us not forget the absurdity of the GNU su security model, which seems to be "We don't need no stinking security."

Some of the problems people encounter with GNU tools are actually intentional incompatibilities with standard Unix tool behavior. One might easily come to the conclusion that someone in the GNU Project is consciously pursuing something akin to Microsoft's "embrace, extend, extinguish" strategy. The direction of development for specific tools, and the very existence of the tools themselves in some cases, have over the years served as handy evidence for such a claim. For instance, GNU Emacs has absorbed so much of the functionality people associate with their OSes that jokes to the effect that it is a fine operating system, but the speaker prefers Unix, sound less and less like jokes. Another example is GNU Info, a byzantine, perversely designed, user-hostile help page system with a bad case of featuritis that is meant to replace the venerable Unix manpage.

Doug McIlroy is an engineer, mathematician, and programmer whose contributions to the development of Unix as we know it are so widespread and fundamental to the Unix experience that he can reasonably be regarded as one of the founders of the Unix operating system tradition. He actually invented the Unix pipeline:

ls -l | grep Sep

That vertical bar symbol is called the "pipe" character in the context of the Unix command line. It provides an incredibly elegant, simple way to pass data between command line tools. That is what he invented. In this case, it sends the output of an ls -l command (long listing of directory contents) to the grep Sep command (filters out everything that does not contain "Sep", the indicator for the month September in ls -l output). You could easily connect other commands into the pipeline:

ls -l | grep Sep | sort | less
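
To watch each stage of a pipeline like the one above do its single job, here is a minimal sketch. It makes two assumptions so that the result does not depend on whatever directory you happen to be in: the listing function and its file names are invented stand-ins for real ls -l output, and the interactive less at the end is swapped for wc -l, which simply counts the lines that reach it.

```shell
# Hypothetical stand-in for `ls -l` output, so the example is reproducible.
listing() {
    printf '%s\n' \
        '-rw-r--r--  1 user  staff  120 Sep 14 09:30 notes.txt' \
        '-rw-r--r--  1 user  staff  450 Aug  2 17:05 draft.txt' \
        '-rw-r--r--  1 user  staff   80 Sep  3 11:45 agenda.txt'
}

# Stage 1 produces lines, stage 2 keeps only lines containing "Sep",
# stage 3 orders them, and stage 4 counts what is left.
listing | grep Sep | sort | wc -l
```

The final stage should report two lines, since only two of the three sample entries contain "Sep". None of the tools knows anything about the others; all they share is a stream of text.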

It was probably Doug McIlroy who first articulated a Unix philosophy, and his explanation is generally accepted as the definition of the Unix philosophy:

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

The first sentence is by far the most often quoted. By contrast, tools like GNU Emacs, GNU grep, and GNU Info violate that principle -- that tools should do one thing and do it well -- many times over without breaking a sweat. It is true that some critics complain about Berkeley's cat utility, rightly pointing out that command line options like -v (which prints visible representations of normally non-printing characters) are not part of cat's actual primary use: concatenating the contents of files. The Berkeley version of cat used on FreeBSD has seven options, at least six of which do not serve the core purpose of cat. The GNU version, however, offers twelve options, according to the manpage, showing a distinct inflation of the feature creep problem as compared with the Berkeley version of the tool. Of course, maybe there are more that are only documented in the Info page for it. That sort of hidden documentation problem is common with GNU tools.
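
For anyone who has never used it, the -v option discussed above can be seen in action with a short sketch. The sample function is a hypothetical helper, not part of any cat implementation; it just emits a line with a Ctrl-A control byte (octal 001) buried in the middle.

```shell
# Hypothetical helper: the letter a, a Ctrl-A byte (octal 001), the letter b, newline.
sample() {
    printf 'a\001b\n'
}

# Plain cat passes the control byte through untouched; most terminals show nothing for it.
sample | cat

# With -v, cat rewrites the control byte in caret notation instead.
sample | cat -v
```

The second command prints a^Ab, with the control byte made visible as ^A. Useful, perhaps, but it is a job of transforming text rather than concatenating files, which is exactly the complaint.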

GNU tools abandoned any meaningful sense of the Unix philosophy quite thoroughly, a long time ago. Maybe it really is for the best that someone else, namely Linus Torvalds, deflected the early development of a GNU operating system by providing a kernel outside of the GNU Project's direct control to be used with those tools. The Linux community's desire for a free Unix-like system that imitates much of the SysV branch of the extended Unix family has probably served to retard the growth of the contrary GNU philosophy, which would probably have run quite egregiously out of control if a complete GNU OS had been released at about the same time.

The GNU Project is aptly named. GNU means "GNU's Not Unix". If what you want is Unix, look elsewhere.