Building custom subroutines in Perl

Subroutines are prepackaged pieces of code that are designed to help you quickly accomplish common tasks. Perl comes with a bunch of built-in subroutines, such as print(), but you can also easily build your own reusable subroutines.

If you're new to Perl, you may not know what a "subroutine" or "function" is, but you've almost certainly used one. Subroutines can be used (or "invoked") from anywhere in your program, and they bring some very useful benefits to your Perl projects.

Most programming languages come with their own built-in functions or subroutines. Every time you print() or join() something in Perl, you're actually using a built-in function. But Perl also allows you to define your own custom subroutines, so that you can save yourself some time and effort when performing common tasks. I'll show you why subroutines are useful, and how to create your own.

Advantages of subroutines

Subroutines are convenient for three reasons:

  1. They let developers break up long procedural scripts into smaller, easier-to-understand fragments. This makes code easier to debug and simplifies locating the source of errors.
  2. By identifying commonly used tasks and then encapsulating those tasks into independent packages, subroutines make code reuse a reality. It's not uncommon for Web developers to maintain their own library of commonly used subroutines and import them whenever needed.
  3. A subroutine is created once but invoked many times. So if a code update is needed in the future, the changes can be done in one spot (the subroutine definition) while the subroutine invocations remain untouched.

Defining a subroutine

Here's a simple example of a subroutine:

# define subroutine
sub makeJuice
{
    print "Making lemon juice...";
}

Every subroutine follows a few basic rules:

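  1. The subroutine definition begins with the "sub" keyword, followed by the name of the subroutine. This name is what you use to invoke the subroutine in your scripts. Leave out parentheses after the name in the definition unless you need them: there they declare a prototype (or, in recent versions of Perl, a signature), not an ordinary argument list.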
  2. The code that makes up the subroutine is enclosed within curly braces. This code is regular Perl code—you can use variables, loops, conditionals and all the usual Perl constructs inside it.
  3. Subroutines may appear anywhere in a Perl script, or may even be imported from external files.
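
As a minimal sketch of the third rule, the script below calls a subroutine before its definition appears in the file. The name makeTea() is made up for this illustration, and the call uses parentheses so that Perl knows it is a subroutine call even though the definition comes later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# call the subroutine before its definition appears in the file
makeTea();

# define subroutine (makeTea is a hypothetical name used only in this sketch)
sub makeTea
{
    print "Making tea...\n";
}
```

Perl compiles the whole file before running it, which is why the call on the earlier line can find the definition on the later one.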

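To call a subroutine, write its name followed by parentheses. You can also precede the name with an ampersand (&), an older calling style that the short examples in this article use; in Perl 5 the ampersand is optional: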

# call subroutine
&makeJuice();

When the Perl interpreter sees this call, it looks for the subroutine named makeJuice() and executes it. You can invoke the same subroutine as many times as you like. In this particular example, we call the makeJuice() subroutine four times:

#!/usr/bin/perl

# define subroutine
sub makeJuice
{
    print "Making lemon juice...\n";
}

# call subroutine
&makeJuice();
&makeJuice();
&makeJuice();
&makeJuice();

The above script produces the following output:

Making lemon juice...
Making lemon juice...
Making lemon juice...
Making lemon juice...

Of course, there will come a time when you're fed up with all that lemon juice and would prefer something different. That's why subroutines can accept arguments: user-defined values that are passed to the subroutine when it is called and then processed by the code inside it. Let's see how to pass arguments to a subroutine.

Passing arguments to a Perl subroutine


Suppose I would like to give the makeJuice() subroutine more intelligence by telling it which flavor of juice to make:

&makeJuice("strawberry");

Invoking a subroutine with an argument is the easy part—I still need to write the code that accepts the argument and does something with it:

sub makeJuice
{
    # retrieve the argument
    my ($flavor) = shift (@_);

    # and use it
    print "Making $flavor juice...\n";
}

Perl has a somewhat unusual way of handling subroutine arguments. All arguments passed to a subroutine are stored in the special @_ array. To retrieve the arguments, you have to look inside the array and extract them.

In the revised subroutine definition above, we use the shift() function to extract the first element of the array—the flavor—and assign it to a variable. This variable is then used in the call to print().

If you don't like the shift() syntax, you can also use "regular" array notation (indexing):

sub makeJuice
{
    # retrieve the argument
    my ($flavor) = $_[0];

    # and use it
    print "Making $flavor juice...\n";
}
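
When a subroutine takes more than one argument, a common third option is to copy the whole @_ array into a list of variables in one step. The sketch below assumes a hypothetical describeJuice() subroutine with a second argument for the number of glasses; neither the name nor the extra argument comes from the examples above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical two-argument variant of makeJuice()
sub describeJuice
{
    # copy both arguments out of @_ in one step
    my ($flavor, $glasses) = @_;

    # and use them
    print "Making $glasses glasses of $flavor juice...\n";
}

describeJuice("strawberry", 2);
```

The arguments are assigned in the order in which they were passed, so the first value ends up in $flavor and the second in $glasses.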

This next listing is another, slightly more useful example:

#!/usr/bin/perl

# define subroutine to convert between dollars and euros
sub convertCurrency
{
    # get amount in $
    my ($usd) = shift (@_);

    # specify conversion rate
    my $convRate = 0.82;

    # print amount in euro
    print "USD $usd = ", sprintf("%0.2f", $usd * $convRate), " EUR\n";
}

# invoke function with custom $ amount
&convertCurrency(100);

Here, the convertCurrency() subroutine performs the conversion between dollars and euros. The amount of USD to be converted is passed to the subroutine as an argument, and the output contains the corresponding amount in euros. Adding arguments to a subroutine thus immediately makes the subroutine more flexible and useful.
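
To take the flexibility one step further, the conversion rate could itself become an optional argument. This is only a sketch: convertWithRate() is a hypothetical name, and the fallback of 0.82 is simply the sample rate used in the listing above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical variant that accepts the rate as an optional second argument
sub convertWithRate
{
    my ($usd, $convRate) = @_;

    # fall back to the sample rate when the caller passes no rate
    $convRate = 0.82 unless defined $convRate;

    print "USD $usd = ", sprintf("%0.2f", $usd * $convRate), " EUR\n";
}

# uses the fallback rate
convertWithRate(100);

# uses a rate supplied by the caller
convertWithRate(100, 0.5);
```

If an argument is not passed, the corresponding variable is simply undefined, which is what the defined check relies on.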

Sending back return values from a subroutine

Now, consider this variant on the convertCurrency subroutine from the previous example:

#!/usr/bin/perl

# define subroutine
sub convertCurrency
{
    # get amount in $
    my ($usd) = shift (@_);

    # specify conversion rate
    my $convRate = 0.82;

    # convert value
    my $euro = $usd * $convRate;
}

# invoke function with custom $ amount
print &convertCurrency(100);

Even though the subroutine does not print any output, it does return a value, which can be caught and used by the main script. By default, this is the value of the last expression evaluated by the subroutine.

If you like, you can override this return value by specifying your own with a "return" statement:

#!/usr/bin/perl

# define subroutine to check file status
sub checkFileStatus {
    # get file path
    my ($file) = $_[0];

    # test file status
    # return 1 or 0
    return (-r $file) ? 1 : 0;
}

# invoke function with filename
# check return value and print appropriate message
my $status = &checkFileStatus('/usr/local/mail.cf');

if ($status == 1) {
    print "File is readable\n";
} else {
    print "File is not readable\n";
}

The "-r" test checks if the file is readable, and the subroutine sends back true or false to the caller depending on what it finds. The main script then checks this return value and prints an appropriate message.
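
Because the return value is simply 1 or 0, it can also be used directly as a condition without storing it first. The sketch below assumes a hypothetical isReadable() subroutine that works like checkFileStatus(), and it uses the core File::Temp module to create a throwaway file so that there is something to test.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# hypothetical helper: return 1 if the file is readable, 0 otherwise
sub isReadable
{
    my ($file) = @_;
    return (-r $file) ? 1 : 0;
}

# create a temporary file; it is deleted when the script exits
my ($fh, $path) = tempfile(UNLINK => 1);

# use the return value directly as a condition
if (isReadable($path)) {
    print "$path is readable\n";
}

# a path that does not exist is reported as not readable
my $missing = $path . '.missing';
if (!isReadable($missing)) {
    print "$missing is not readable\n";
}
```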

You can also write a subroutine that returns a list of values, such as the contents of an array, instead of a single scalar value:

#!/usr/bin/perl

# define subroutine to
# split an email address into
# user and domain
sub breakEmailAddress
{
    # get address
    my ($address) = shift(@_);

    # split address on the @ symbol into an array
    my @components = split(/@/, $address);

    # return array
    return @components;
}

# split email address
# print the components
my @output = &breakEmailAddress('john@some.domain.com');
print "Username is ", $output[0], "\nDomain is ", $output[1], "\n";

Here is the output:

Username is john
Domain is some.domain.com
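
One detail worth knowing when a subroutine returns an array: what the caller receives depends on how the call is written. The sketch below uses a hypothetical splitAddress() subroutine modeled on breakEmailAddress(). Assigning the result to a list of variables unpacks the elements, while assigning it to a single scalar yields the number of elements instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper modeled on breakEmailAddress()
sub splitAddress
{
    my ($address) = @_;
    my @components = split(/@/, $address);
    return @components;
}

# list assignment: the elements are unpacked into separate variables
my ($user, $domain) = splitAddress('john@some.domain.com');

# scalar assignment: the array is evaluated in scalar context,
# which gives the number of elements
my $count = splitAddress('john@some.domain.com');

print "User: $user, domain: $domain, parts: $count\n";
```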

Now let's take a look at how Perl deals with variable scope.

Variable scoping in Perl subroutines


There's one last thing you need to know when dealing with variables in the context of Perl subroutines. By default, a variable that has not been declared with "my" is a global variable, even when it is used inside a subroutine. That is, it can be accessed and modified from outside the subroutine as well as inside it:

#!/bin/perl

# define a variable in the main script
$i = 9;

# a subroutine
# that changes the value of a variable
# defined outside
sub setI
{
    $i = 18;
}

# check the value of $i
print "before — \$i = $i\n";
# run the subroutine
&setI();
# check the value of $i again
print "after — \$i = $i\n";

If you run this script you should get the following output:

before — $i = 9
after — $i = 18

The code inside the subroutine modifies a variable from outside it. This is probably not what you want and can be a difficult kind of bug to track down. To avoid the numerous pitfalls, use the "my" keyword to mark variables inside a subroutine as private:

#!/bin/perl

# define a variable in the main script
$i = 9;

# a subroutine
# that sets its own private $i
# and leaves the outer variable alone
sub setI
{
    my $i;
    $i = 18;
}

# check the value of $i
print "before — \$i = $i\n";
# run the subroutine
&setI();
# check the value of $i again
print "after — \$i = $i\n";
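
With the "my" declaration in place, the script above prints 9 both times, because the $i inside setI() is a separate, private variable. A related habit, sketched below as a suggestion rather than something the scripts above do, is to put "use strict" at the top of a script so that Perl refuses to compile code that uses undeclared variables. The variable $inner is introduced only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# declare the outer variable with my, as strict requires
my $i = 9;

# the subroutine works on its own private $i and returns it
sub setI
{
    my $i = 18;
    return $i;
}

# $inner is a name used only in this sketch
my $inner = setI();

print "inside the subroutine \$i was $inner\n";
print "outside the subroutine \$i is still $i\n";
```

Under strict, forgetting the "my" in the main script would stop the script at compile time instead of silently creating a global.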

A real-world example

Finally, a real-world example to tie it all together. When creating complex scripts, developers often need a way to log the activities of the script as it proceeds. Typically, this log consists of messages or warnings, each marked with a timestamp and written to a text file in a predefined format.

Since a single script might well generate numerous messages, it is inconvenient and unwieldy to keep repeating the "open file-write message-close file" sequence at every stage where a log message is to be generated. Instead, it is far more convenient to incorporate this logging code into a separate subroutine, and call it wherever needed.

That's exactly what the next script does:

#!/usr/bin/perl

# subroutine to write a line to the log
sub writeLog
{
    # get message code and text
    my $code = $_[0];
    my $msg = $_[1];

    # map of numeric codes to message types
    # update this list as you add message types
    my @codeMap = ('', 'INFO', 'CRIT');

    # map code to human-readable message type
    my $msgType = $codeMap[$code];

    # get time
    my $ts = localtime();

    # create the log string and print it
    print LOG "$ts $msgType $msg\n";
}

# main script starts here
# load module
use DBI();

# open log file for appending
open(LOG, ">>debug.log") or die "Cannot open debug.log: $!";

# mark the beginning of script execution
writeLog(1, '-- BEGIN --');

# connect to database
writeLog(1, 'Attempting to connect to database');
my $dbh = DBI->connect("DBI:mysql:database=mydb;host=myhost", "user", "pass");

# check for connection, log status
if (!$dbh) {
    writeLog(2, 'Could not connect to database');
    writeLog(1, '-- END --');
    die "Could not connect to database\n";
} else {
    writeLog(1, 'Connected to database');
}

# execute query
writeLog(1, 'Preparing query');
my $query = "SELECT * FROM receipts WHERE date = '1999-08-17'";
my $sth = $dbh->prepare($query);
my $ret = $sth->execute();

# check for query, log status
if ($ret) {
    writeLog(1, 'Executing query: ' . $query);
} else {
    writeLog(2, 'Could not execute query: ' . $query);
    writeLog(1, '-- END --');
    die "Could not execute query\n";
}

# iterate through resultset
writeLog(1, 'Attempting to process result set');

# log each iteration
while (my $ref = $sth->fetchrow_hashref()) {
    writeLog(1, 'Got row');
    print "Customer: $ref->{'name'}\nDate: $ref->{'rdate'}\nAmount: $ref->{'amt'}\n\n";
}

# clean up
writeLog(1, 'Attempting to disconnect from database');
$dbh->disconnect();

# log disconnection
writeLog(1, 'Disconnected from database');

# mark end of script execution
# close log file
writeLog(1, '-- END --');
close (LOG);

Stripped to its core, this script is very simple—all it does is connect to a database, run a query, and spit out the result. However, in order to keep track of what's happening at each stage, a custom writeLog() function is called at important points to record the progress of the script.

The writeLog() subroutine takes two arguments: a numeric code indicating the severity of the message (1 for notifications, 2 for errors) and a string message. These arguments are combined with a timestamp to create a single string, which is then written to the log file. Each line in the log therefore consists of the timestamp, the message type (INFO or CRIT), and the message text.
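
One possible refinement, sketched here under assumptions rather than taken from the script above, is to pass the filehandle to the subroutine as an argument instead of relying on the global LOG handle. The name writeLogTo() is hypothetical, and the example logs to an in-memory buffer so that it can run without a database or a log file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical variant of writeLog() that takes the filehandle as an argument
sub writeLogTo
{
    my ($fh, $code, $msg) = @_;

    # same code-to-type mapping as in the script above
    my @codeMap = ('', 'INFO', 'CRIT');

    # get time
    my $ts = localtime();

    # create the log string and print it to the given handle
    print {$fh} "$ts $codeMap[$code] $msg\n";
}

# for illustration, log to an in-memory buffer instead of a file
my $buffer = '';
open(my $log, '>', \$buffer) or die "Cannot open in-memory log: $!";

writeLogTo($log, 1, 'Connected to database');
writeLogTo($log, 2, 'Could not execute query');

close($log);

# show what was logged
print $buffer;
```

Passing the handle in keeps the subroutine independent of any one log file, so the same code could write to several logs or be tested without touching the disk.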
