Perl is renowned for being a language where you can express complicated commands in a very small amount of space. In this article, following up from my last Perl article, we’ll explore a little bit about how this is possible.

We’ll start with the simplest of programs, which simply reads in characters from the keyboard and repeats them back to the console. In Perl you might write this like so:

while ($line = <STDIN>) {
 print $line;
}

Even at this stage the program is quite compact, but what does it do? <STDIN> reads from a special file handle, in this case the standard input (called STDIN), which is usually connected to the keyboard. Each time we assign <STDIN> to the variable $line, we take the next line from standard input and put it in $line. When the input runs out, <STDIN> returns the undefined value (undef), which the while statement treats as false. The rest of the program is fairly self-explanatory: now that we have the input line in the variable $line, we use the print function to print it to the screen or, more accurately, to the standard output (STDOUT), which is usually connected to the screen. Both standard input and standard output can be redirected, for example to files for storing the output of programs, but if you’re dealing with text it’s usually safe to assume they’re equivalent to the keyboard and screen.
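To see the end-of-input behaviour without typing at the keyboard, here is a minimal sketch that reads from an in-memory file handle instead of STDIN. The helper name read_lines and the sample text are made up for illustration. It shows that the loop stops because the read yields the undefined value once nothing is left.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a file handle into a list.
# The loop ends when the read returns undef, meaning no input is left.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        push @lines, $line;
    }
    return @lines;
}

# An in-memory file handle stands in for the keyboard here.
my $text = "first\nsecond\n";
open(my $in, '<', \$text) or die "cannot open in-memory handle: $!";

my @lines = read_lines($in);
print "read ", scalar(@lines), " lines\n";            # read 2 lines

# One more read after the end of input gives back undef.
my $next = <$in>;
print "another read is ", (defined $next ? "defined" : "undef"), "\n";
close($in);
```

The explicit defined() test spells out what Perl already does behind the scenes when a line read is the only thing assigned in a while condition.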

You may think that this program is already as short as it can be, but through using Perl’s special variables, we can make it shorter:

The default scalar variable: $_

Perl has a number of special variables that are automatically assigned in the general course of a program. They can be used to access information about the program itself, such as its name or process id, the command line arguments, or the results of the last regular expression match. The most general, and maybe the most useful, of these special variables is $_, the default variable. The default variable is where the results of some Perl constructs and functions are put if you don’t specify an assignment, and it is used as the argument to certain functions if none is given. This sounds vague and can be confusing until you’re familiar with it, but it can also be powerful. We can use $_ to eliminate the need for the variable $line in our program:

while (<STDIN>) {
 print $_;
}

This program is equivalent, since when a line read from a file handle is used by itself as the test of a while statement, the line is put into the default variable. Then when we print, we can just reference $_ to access that input. But we can make this program shorter still. Remember that $_ is used as the default argument for some functions if none is given? Well, print is one of those functions, so we can now write this program as follows:

while (<STDIN>) {
 print;
}

Now we’ve got a program that does the same thing, but eliminates explicit variables altogether. Since we’re really just connecting STDIN to STDOUT, it would be nice if we could get rid of that while loop too. It’s not doing anything interesting except iterating over the input. Well, this too is possible:

print <STDIN>;

How this works is a little more complicated. In the test of a while loop, <STDIN> is evaluated in what’s called a scalar context, meaning that a single value is expected, such as a number or a string, and not a collection. In that context it returns just the next line. print, on the other hand, evaluates its arguments in a list context (sometimes called array context), meaning that a list of values is expected, and it prints each one in turn. When we use <STDIN> with print in this way, it returns all the remaining lines of standard input as a list of strings and print outputs them in order, which has the same result as the while loop. One difference is that this version reads all of the input before printing any of it. It might be an extreme example, but by using a few Perl shortcuts we’ve cut our program from three lines to one.
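The difference between the two contexts is easier to see side by side. The following is a small sketch, again using an in-memory file handle in place of STDIN. The helper name split_input and the sample text are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line (read in scalar context)
# and a reference to a list of the remaining lines (read in list context).
sub split_input {
    my ($fh) = @_;
    my $first = <$fh>;    # scalar context: just the next line
    my @rest  = <$fh>;    # list context: everything that is left
    return ($first, \@rest);
}

my $text = "one\ntwo\nthree\n";
open(my $in, '<', \$text) or die "cannot open in-memory handle: $!";
my ($first, $rest) = split_input($in);
close($in);

print "first: $first";                               # first: one
print "remaining lines: ", scalar(@$rest), "\n";     # remaining lines: 2

# print gives its arguments list context, so one call prints every line.
open(my $again, '<', \$text) or die "cannot open in-memory handle: $!";
print <$again>;
close($again);
```

Reading in list context pulls the whole input into memory at once, so for very large inputs the while loop is the gentler choice.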

This is fine if we just want to mirror STDIN to STDOUT, but what if we’d like our program to act more like the Unix filter cat, which can open and print files as well? We could check the command line arguments, test whether they’re valid files, then open and print them in order, but since this is such a common thing to do, Perl has an easier (and shorter!) way.

The special file handle: <>

Like the default variable, the special file handle, written as <>, is a shortcut added to the language to make programs easier to write. The special file handle treats all command line arguments as file names and opens each in turn. If there are no command line arguments, it reads from STDIN instead. As per the UNIX convention, if “-” is given as a command line argument, it reads STDIN in place of a file. So if we wanted a version of the above program that supports files given on the command line, it would be as simple as:

print <>;

When you consider that you can write a working implementation of cat in fewer than ten characters, you can see why Perl is considered so powerful. But what if we want to do something more significant with the input than just echo it back to the screen?
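To appreciate what <> saves us, here is a rough longhand sketch of the same idea. It is only an approximation of what Perl does internally, and the helper name cat_files is hypothetical. It opens each named file in turn, treats “-” as standard input, and falls back to standard input when no names are given.

```perl
use strict;
use warnings;

# Hypothetical helper: copy each named file to the handle $out, in order.
# No names at all means standard input; the name "-" also means standard input.
sub cat_files {
    my ($out, @names) = @_;
    @names = ('-') unless @names;
    for my $name (@names) {
        my $fh;
        if ($name eq '-') {
            $fh = \*STDIN;
        } elsif (!open($fh, '<', $name)) {
            warn "Can't open $name: $!\n";   # complain and move to the next file
            next;
        }
        while (my $line = <$fh>) {
            print {$out} $line;
        }
        close($fh) unless $name eq '-';
    }
    return;
}

# Only run when file names are given, so this sketch never waits on the keyboard.
cat_files(\*STDOUT, @ARGV) if @ARGV;
```

That is around twenty lines of bookkeeping replaced by two characters, which is the kind of saving that makes <> worth learning.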

Counting Line Numbers

If we want to process the lines of the input individually, then it’s not enough to just link the file handle to print. Let’s take a look at a simple program that adds line numbers to the lines of input:

$num = 0;
while (<>) {
 $num = $num + 1;
 print "$num\t$_";
}

In this example we use the variable $num to keep track of the line number. For each line of input we increment this number, then print the number and the line of input together, separated by a tab. When we refer to variables inside a string delimited by double quote characters ("), each variable name is replaced with the contents of that variable, which makes formatted output in Perl a breeze.
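As it happens, Perl has a special variable for this job as well: $. holds the number of the line most recently read from a file handle. The sketch below uses it in place of a hand-maintained counter. The helper name number_lines is hypothetical, and an in-memory file handle stands in for real input.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out, prefixing each line with its number.
sub number_lines {
    my ($in, $out) = @_;
    local $_;                      # keep the caller's $_ untouched
    while (<$in>) {
        print {$out} "$.\t$_";     # $. is the number of the line just read
    }
    return;
}

my $text = "apple\nbanana\n";
open(my $in, '<', \$text) or die "cannot open in-memory handle: $!";
number_lines($in, \*STDOUT);       # prints "1<TAB>apple" then "2<TAB>banana"
close($in);
```

One thing to keep in mind: when reading through <>, the count in $. carries on from one file to the next rather than restarting for each file, which may or may not be what you want.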

Even with these simple programs, it’s easy to see how using special variables can make your programs smaller and faster to write. If you’re interested, information about all of Perl’s special variables can be found in the perlvar section of the Perl manual (