Simpler code: Tips on using Perl's map function and command-line parsing

Find out how you can use Perl's map function for more efficient text manipulation. Then, check out several methods for evaluating switches passed into your program's command line.

By Charles Galpin

Perl's map function can come in handy when you need to simplify potentially repetitive operations, such as capitalizing strings of text. In this article, we'll offer several examples of how you can put map to work. Then, we'll turn our attention to Perl's parsing capabilities with a look at various ways you can parse your program's command line to extract switches or other information.

The power of map
Perl offers many functions that help to simplify and shorten code.

Among the more powerful is map, which takes a list, evaluates a specified block or expression on each element, and then returns a list of all the results. Inside the block, map locally assigns $_ as an alias to the current list item.
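The aliasing of $_ is worth seeing once in isolation. The following is a minimal sketch with made-up sample data (the variable names are ours, not part of any library): reading $_ leaves the source list alone, while assigning to it writes through to the original array.

```perl
use strict;
use warnings;

# Sample data, invented for illustration.
my @words = ('alpha', 'beta');

# Reading $_ leaves the source list untouched.
my @upper = map { uc $_ } @words;

# Assigning to $_ writes through the alias and changes @words itself.
my @marked = map { $_ .= '!'; $_ } @words;

print "@upper\n";   # ALPHA BETA
print "@words\n";   # alpha! beta!
```

Because of this, it is usually safest to treat $_ as read-only inside map unless you really do want to modify the source list.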

One of the simplest uses of map is to capitalize an entire array by applying the uc function to each element:
    @caps = map uc, @phrases;

In the next sample, mapping a regular expression to the array returns the first word of every phrase:
    @first_word = map { /(\S+)/ } @phrases;
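This works because a match in list context returns its captured groups, and a failed match returns an empty list. Here is a small sketch, assuming some invented phrases, that shows a blank phrase simply dropping out of the result:

```perl
use strict;
use warnings;

# Sample data, invented for illustration; the last phrase has no word in it.
my @phrases = ('  leading spaces here', 'second phrase', '   ');

# Each successful match contributes its one capture; the failed match
# on the blank phrase contributes nothing.
my @first_word = map { /(\S+)/ } @phrases;

print join(',', @first_word), "\n";   # leading,second
```

Note that the result can therefore be shorter than the input list.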

Each element need not necessarily map to a single item. If multiple values are created, map returns them all as a single, flattened list. For example, you could split all words in all phrases into a single list:
    @words = map split, @phrases;
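To make the flattening concrete, here is a short sketch with invented phrases. It uses the block form of map, which behaves the same way as the expression form for this purpose:

```perl
use strict;
use warnings;

# Sample data, invented for illustration.
my @phrases = ('the quick brown', 'fox jumps');

# With no arguments, split breaks $_ on whitespace. Each phrase yields
# several words, and map returns them all as one flat list.
my @words = map { split } @phrases;

print scalar(@words), " words: @words\n";   # 5 words: the quick brown fox jumps
```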

Still another use for map might be to convert a string to title case. You can do this by splitting the string into individual words, lowercasing each word and capitalizing its first letter, and finally joining the words back into a single string:
    $title = join ' ', map { ucfirst lc } split / /, $name;
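One detail to be aware of is how the split pattern treats runs of spaces. The sketch below, using an invented sample string, compares splitting on a single space (which keeps the original spacing) with splitting on ' ' (which collapses whitespace). Which one you want depends on your data.

```perl
use strict;
use warnings;

# Sample data, invented for illustration; note the three spaces before PEACE.
my $name = 'wAR and   PEACE';

# Splitting on a single space yields empty fields for the extra spaces,
# so join puts the original spacing back.
my $kept = join ' ', map { ucfirst lc $_ } split / /, $name;

# Splitting on the string ' ' treats any run of whitespace as one separator.
my $squeezed = join ' ', map { ucfirst lc $_ } split ' ', $name;

print "$kept\n";       # War And   Peace
print "$squeezed\n";   # War And Peace
```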

Our final example uses map to put the sorted key/value pairs of a hash into a two-column HTML table:
    print "<table>\n";
    print map {"<tr><td>$_</td><td>$hash{$_}</td></tr>\n"} sort keys %hash;
    print "</table>\n";
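For readers who want to try this out, here is a self-contained variant with an invented %hash. It builds the table in a string first, which is our choice for easier inspection rather than anything the technique requires:

```perl
use strict;
use warnings;

# Sample data, invented for illustration.
my %hash = (banana => 3, apple => 7, cherry => 1);

# map produces one table row per key, in sorted key order.
my $html = "<table>\n"
         . join('', map { "<tr><td>$_</td><td>$hash{$_}</td></tr>\n" } sort keys %hash)
         . "</table>\n";

print $html;
```

The rows come out in the order apple, banana, cherry because sort is applied to the keys before map sees them.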

Command-line parsing
When you need to determine the command-line switches passed into a Perl program, you can take various approaches. An easy way of identifying expected Boolean switches is to loop through @ARGV, setting a flag for each option that is encountered:
    foreach $arg (@ARGV) {
        $a = 1, next if $arg eq '-a';
        $b = 1, next if $arg eq '-b';
        $c = 1, next if $arg eq '-c';
    }
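A variation on this loop, sketched below, stores the flags in a hash and collects everything else as ordinary arguments. The hash name %flag, the array @rest, and the simulated command line are our own assumptions for illustration. Using a hash also avoids $a and $b, which Perl reserves as the comparison variables for sort.

```perl
use strict;
use warnings;

# Simulated command line so the sketch runs on its own (an assumption;
# a real program would use the @ARGV it was started with).
@ARGV = ('-a', 'input.txt', '-c');

my %flag = (a => 0, b => 0, c => 0);   # one entry per expected switch
my @rest;                              # everything that is not a switch

foreach my $arg (@ARGV) {
    if ($arg =~ /^-([abc])$/) {
        $flag{$1} = 1;                 # record the switch and move on
        next;
    }
    push @rest, $arg;
}

print "a=$flag{a} b=$flag{b} c=$flag{c} rest=@rest\n";   # a=1 b=0 c=1 rest=input.txt
```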

Another simple option is to use Perl's -s switch. In this case, Perl will create variables named the same as each switch and then remove them from @ARGV. For example:
    perl -s script.pl -a -b -c

When script.pl (a placeholder for your program's file name) is executed, the variables $a, $b, and $c are all defined and set to 1. Only the switches listed before any nonswitch argument or "--" will be handled. Therefore, the following may not work as desired:
    perl -s script.pl -a -b 13 -c 6/6/2001

Here, $a and $b are set to 1 and @ARGV contains ('13', '-c', '6/6/2001'). To have $b set to 13 and $c set to 6/6/2001, the command line could be entered as:
    perl -s script.pl -a -b=13 -c=6/6/2001
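If you would like to watch this behavior for yourself, the following sketch writes a throwaway child script to a temporary file and runs it under perl -s with both styles of command line. The helper run_with_s and the temporary script are hypothetical scaffolding of ours; only the -s switch itself comes from Perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Throwaway child script (hypothetical). It reports the switch variables
# that -s created and whatever is left in @ARGV.
my ($fh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_SCRIPT';
print join('|', map { defined $_ ? $_ : 'undef' } $a, $b, $c), " ARGV=@ARGV\n";
END_SCRIPT
close $fh or die "cannot write $script: $!";

# Run the child under "perl -s" with the given arguments and
# return its single line of output.
sub run_with_s {
    my @args = @_;
    open(my $pipe, '-|', $^X, '-s', $script, @args)
        or die "cannot start perl: $!";
    my $line = <$pipe>;
    close $pipe;
    chomp $line;
    return $line;
}

my $spaced = run_with_s(qw(-a -b 13 -c 6/6/2001));
my $equals = run_with_s(qw(-a -b=13 -c=6/6/2001));

print "$spaced\n";   # 1|1|undef ARGV=13 -c 6/6/2001
print "$equals\n";   # 1|13|6/6/2001 ARGV=
```

In the first run, switch processing stops at 13, so $c is never set and the remaining arguments stay in @ARGV. In the second, all three switches are consumed.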

A more robust alternative to using -s is to use either Getopt::Std or Getopt::Long. When called without explicit destination variables, these modules parse the command line and set global variables (named the same as the switch but prefixed with opt_) for each option. In the following sample, -a is a Boolean option that can also be negated as -noa, -b requires that an integer be specified, -c requires a string, and -d accepts an optional string:
    use Getopt::Long;
    GetOptions("a!", "b=i", "c=s", "d:s");

The variables set would be $opt_a, $opt_b, $opt_c, and $opt_d, respectively.
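If you prefer to avoid global variables, for example under use strict, Getopt::Long can also store each option in a variable you supply. The sketch below is one way to do that. The variable names and the sample argument list are our own assumptions, and it uses GetOptionsFromArray so that it can run without a real command line.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Simulated arguments, invented for illustration; a real program would
# normally call GetOptions, which reads @ARGV directly.
my @args = ('-a', '-b', '13', '-c', '6/6/2001', 'leftover.txt');

my ($a_flag, $b_num, $c_str, $d_str);

# Each option specification is paired with a reference to its destination.
my $ok = GetOptionsFromArray(
    \@args,
    'a!'  => \$a_flag,   # Boolean, negatable
    'b=i' => \$b_num,    # required integer value
    'c=s' => \$c_str,    # required string value
    'd:s' => \$d_str,    # optional string value
);
die "bad command line\n" unless $ok;

print "a=$a_flag b=$b_num c=$c_str rest=@args\n";   # a=1 b=13 c=6/6/2001 rest=leftover.txt
```

Options that were not supplied, such as -d here, leave their variables undefined, and any nonoption arguments remain in the array.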
