
Session management in Perl

Follow these step-by-step instructions on creating user sessions in Perl.


You've probably noticed that your favorite shopping sites always seem to know who you are each time you visit and that each page of CD and book recommendations has a funny URL with an incomprehensible string of digits. You've probably also realized that each click you make adds to a profile of what you look at and are interested in and that the shopping site distinguishes your server requests from everyone else's.

While it might take heavy programming to turn a purchase history into a recommendation, it is pretty simple to associate each request to your Web server with an individual user. HTTP itself is no help because it was designed to be stateless: the server retains no information from one request to the next. So without something extra, a server cannot know which requests come from the same person. In this article, we'll look at how to associate multiple requests with a single session and how to store information specific to that session on the server.

The most common session management tool in Perl is the Apache::Session package, available on CPAN. This package provides a very easy interface to persistent data sets associated with sessions and has many data storage options. Apache::Session also generates session IDs for you: unique identifiers that associate different HTTP requests with a particular browser session. The session IDs that Apache::Session generates are 32-character MD5 hashes, for example:
3538d6f50b74ceafa68aa734ea21646f

These session IDs are unique for all practical purposes, which is necessary to prevent two browsers from ever appearing with the same session ID.
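A rough sketch of how an ID of this shape can be produced with the core Digest::MD5 module. It approximates the general idea (hashing values that are unlikely to repeat) and is not a copy of Apache::Session's internals; the name generate_session_id is hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build a 32-character hexadecimal ID from values that are unlikely to
# repeat together: the current time, the address of a fresh reference,
# a random number, and the process ID.
sub generate_session_id {
    my $fresh_ref = {};    # stringifies to something like HASH(0x55d0...)
    return md5_hex( md5_hex( time() . "$fresh_ref" . rand() . $$ ) );
}

my $sid = generate_session_id();
print "$sid\n";
```

Each call prints a different 32-character string of hexadecimal digits. An ID built this way is hard to collide with by accident, but the inputs are not cryptographically strong, so treat it as an illustration only.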

Cookie-based sessioning
The easiest way to assign a browser a session ID is with HTTP cookies, which store small bits of data on the remote browser. With each request that's sent back to your server, the browser will return the session ID that it was originally assigned. Let's look at a simple CGI example of assigning a session ID.
#!/usr/bin/perl

use strict;
use warnings;
use CGI;
use Apache::Session::File;

my $query = CGI->new;
my %session;

# The cookie value is undefined on a browser's first visit.
my $id = $query->cookie(-name=>"SID01");

# An undefined $id makes Apache::Session create a new session.
tie %session, 'Apache::Session::File', $id,
      { Directory => "/tmp/",
      LockDirectory => "/tmp/"};

if (!defined $id) {
   # First visit: hand the new session ID to the browser as a cookie.
   my $cookie = $query->cookie( -name=>'SID01',
         -value=>$session{_session_id},
         -expires=>'+1y',
         -path=>'/session');
   print $query->header(-cookie=>$cookie);
   print "Assigned session ID<br>\n";
} else {
   print $query->header();
   print "Not assigned session ID<br>\n";
}

$id = $session{_session_id};

print "<html>\n";
print " <head><title>Session ID</title></head>\n";
print " <body bgcolor='#ffffff'>\n";
print " Your session ID is $id\n";
print " </body>\n";
print "</html>\n";

# Release the lock and flush the session to disk.
untie %session;


As you can see, most of the code deals with assigning the session ID as a cookie named SID01 in the HTTP header. When you load the CGI script for the first time, the browser won't pass a session ID to the HTTP server, so the value of $id will be undefined. The tie statement that follows hands that value to Apache::Session. If the value is undefined, Apache::Session knows that it must generate a new session ID; if a value is provided, it locates the data store associated with that session and dies if no such session exists. If a new session ID is generated, the script returns it as a cookie in the outgoing HTTP header. Apache::Session always provides you with the session ID if you query the hash with the key _session_id.

Load this CGI script a few times, and you'll see that you always get the same session ID. You'll continue to use the same session ID until the cookie expires, which it is set to do one year after it is issued.
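Because the session ID arrives from the browser, it is worth checking its shape before handing it to Apache::Session. The following is a minimal sketch, assuming the 32-character lowercase hexadecimal IDs shown earlier; clean_session_id is a hypothetical helper, not part of any module.

```perl
use strict;
use warnings;

# Return the cookie value only if it looks like a 32-character hex ID;
# otherwise return undef so that a fresh session gets created.
sub clean_session_id {
    my ($raw) = @_;
    return undef unless defined $raw;
    return $raw =~ /\A([0-9a-f]{32})\z/ ? $1 : undef;
}

for my $candidate ('3538d6f50b74ceafa68aa734ea21646f', '../../etc/passwd') {
    my $ok      = defined clean_session_id($candidate);
    my $verdict = $ok ? 'accepted' : 'rejected';
    print "$verdict: $candidate\n";
}
```

In the CGI script, the result of clean_session_id would replace the raw cookie value passed to tie. A well-formed ID that the data store does not know about still makes tie die, so a more defensive script would also wrap the tie call in eval and fall back to an undefined ID.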

Storing session data
A session created with Apache::Session is a tied hash, so anything you store in the hash is written to the persistent data store when the hash is untied or goes out of scope, making it incredibly easy to use. Let's look at a simple example of a CGI script that collects form data. In this case, we want to show users the form values from the last time they submitted. We'll collect a name, an address, and a birthday. You'll see that the session ID and form values are the same every time you return to the page. We store the values on the server, updating them when the user submits new data.
#!/usr/bin/perl

use strict;
use warnings;
use CGI;
use Apache::Session::File;

my $query = CGI->new;
my %session;

# The cookie value is undefined on a browser's first visit.
my $id = $query->cookie(-name=>"SID01");

tie %session, 'Apache::Session::File', $id,
      { Directory => "/tmp/",
      LockDirectory => "/tmp/"};

if (!defined $id) {
   my $cookie = $query->cookie( -name=>'SID01',
         -value=>$session{_session_id},
         -expires=>'+1y',
         -path=>'/session');
   print $query->header(-cookie=>$cookie);
   print "Assigned session ID<br>\n";
} else {
   print $query->header();
   print "Not assigned session ID<br>\n";
}

$id = $session{_session_id};

# Save newly submitted form values in the session.
if ($query->param()) {
   $session{"name"} = $query->param("name");
   $session{"address"} = $query->param("address");
   $session{"birthday"} = $query->param("birthday");
}

# Escape the stored values before echoing them back into the page.
my %value;
for my $field (qw(name address birthday)) {
   my $stored = $session{$field};
   $value{$field} = defined $stored ? $query->escapeHTML($stored) : '';
}

print "<html>\n";
print " <head><title>Session info</title></head>\n";
print " <body bgcolor='#ffffff'>\n";
print " <form action='/session/userinfo.cgi' method='post'>\n";
print " <b>Name: </b>";
print qq{ <input type="text" size="12" name="name" value="$value{name}"><br>\n};
print " <b>Address: </b>";
print qq{ <input type="text" size="12" name="address" value="$value{address}"><br>\n};
print " <b>Birthday: </b>";
print qq{ <input type="text" size="12" name="birthday" value="$value{birthday}"><br>\n};
print " <input type='submit'></form>\n";
print " Your session ID is $id\n";
print " </body>\n";
print "</html>\n";

# Release the lock and flush the session to disk.
untie %session;


This is a somewhat contrived example, but there are some very practical uses that you can employ on your Web site. You might find it helpful to store the date of the last visit so that you can show a list of things that have changed on your site since then. Or you could store the last five pages viewed to create a quick index to them. More cleverly, you could keep a user's preferences and dynamically customize the content of your site based on them, without requiring the user to do anything more than return to your Web site.
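The "last five pages viewed" idea can be sketched as follows. A plain hash stands in for the tied session hash so that the example runs on its own; remember_page and the recent_pages key are hypothetical names.

```perl
use strict;
use warnings;

# Record $url as the most recent page, keeping at most $limit entries.
# A new array reference is assigned to the top-level key on every call,
# because a tied session hash may not notice changes made deep inside
# an existing nested structure.
sub remember_page {
    my ($session, $url, $limit) = @_;
    $limit ||= 5;
    my @pages = grep { $_ ne $url } @{ $session->{recent_pages} || [] };
    unshift @pages, $url;
    splice @pages, $limit if @pages > $limit;
    $session->{recent_pages} = [@pages];
    return @pages;
}

my %session;    # stands in for the tied Apache::Session hash
remember_page(\%session, "/page$_.html") for 1 .. 7;
print join(", ", @{ $session{recent_pages} }), "\n";
# prints: /page7.html, /page6.html, /page5.html, /page4.html, /page3.html
```

Revisiting a page moves it to the front instead of listing it twice, which keeps the index short and free of duplicates.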

One advantage that Apache::Session has, besides its ease of use, is that its data persists in a common data store no matter which CGI script, mod_perl module, or Mason component calls it. So you can use the information that you store in all of your Perl code: a name stored in a CGI script can be recalled by a mod_perl module, as long as you can provide the same session ID.

Automatic session management
You can ensure that every request to your Apache Web server is assigned a session ID by creating a mod_perl module that does it for you. This module will place a session ID cookie into any HTTP header that doesn't have one.
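package Apache::CookieMonster;

use strict;
use warnings;
use Apache::Constants qw(DECLINED);
use CGI::Cookie;
use Apache::Session::File;

sub handler {
   my $r = shift;

   # Pull the SID02 value out of the incoming Cookie header, if any.
   my $input = $r->header_in("Cookie") || '';
   my ($id) = $input =~ /\bSID02=(\w+)/;

   # An undefined $id makes Apache::Session create a new session.
   my %session;
   tie %session, 'Apache::Session::File', $id,
      { Directory => "/tmp/",
         LockDirectory => "/tmp/"};

   # No cookie yet: send the new session ID back to the browser.
   if (!defined $id) {
      my $cookie = CGI::Cookie->new( -name=>'SID02',
         -value=>$session{_session_id},
         -expires=>'+1y',
         -path=>'/');
      $r->header_out("Set-Cookie" => $cookie->as_string);
   }

   untie %session;

   # Let Apache carry on with the rest of the request as usual.
   return DECLINED;
}

1;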


All you have to do is save this package as Apache/CookieMonster.pm in a directory on your Perl interpreter's @INC path, then add the following to your Apache httpd.conf:
PerlHeaderParserHandler Apache::CookieMonster
PerlModule Apache::CookieMonster


This tells Apache to load the module at startup and to run its handler during the header-parsing phase of every request.
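The cookie-parsing step in the handler can be tried outside Apache. Here is a minimal sketch, assuming a raw Cookie header of the usual "name=value; name=value" form; cookie_value is a hypothetical helper.

```perl
use strict;
use warnings;

# Return the value of the named cookie from a raw Cookie header,
# or undef if the header is missing or lacks that cookie.
sub cookie_value {
    my ($header, $name) = @_;
    return undef unless defined $header;
    for my $pair (split /;\s*/, $header) {
        my ($key, $value) = split /=/, $pair, 2;
        return $value if defined $key && $key eq $name;
    }
    return undef;
}

my $header = 'theme=dark; SID02=3538d6f50b74ceafa68aa734ea21646f; lang=en';
print cookie_value($header, 'SID02'), "\n";
# prints: 3538d6f50b74ceafa68aa734ea21646f
```

Comparing whole cookie names avoids picking up a value from a cookie whose name merely ends in SID02. CGI::Cookie also provides a parse method that turns a Cookie header into a hash of cookie objects, which may be the simpler choice inside the handler.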

Storage and further development
Once you have a good grasp of what to do with the session manager, you should consider your storage options. In the preceding examples we used a flat-file storage method. However, that might not be the best choice for you, and you'll certainly want to find a more permanent place for your session files than the /tmp directory.

How you store the data depends on what your needs are. Some applications can store session data in flat files, while others will need the power of a relational database. That's why Apache::Session is designed to support multiple storage mediums. It ships with DB_File, Oracle, MySQL, Postgres, and Sybase modules, and you can create your own. Each has its own configuration needs and documentation but works similarly to the File storage in our examples. Keep in mind that using File more or less ties you to one machine, limiting scalability. If you use some sort of load-balancing method, you'll need each machine to keep the session IDs consistent, or you'll end up with confusion when users get a different ID from whichever machine serves them.
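Switching to a database backend mostly means changing the tie arguments. The sketch below uses Apache::Session::MySQL and assumes a reachable MySQL database; the DSN, user name, and password are placeholders, and the module's documentation describes the sessions table it expects.

```perl
use strict;
use warnings;
use Apache::Session::MySQL;

my $id;         # undef creates a new session; otherwise pass the cookie value
my %session;

# Placeholder connection details: replace the DSN, user, and password.
tie %session, 'Apache::Session::MySQL', $id, {
    DataSource     => 'dbi:mysql:sessions',
    UserName       => 'session_user',
    Password       => 'secret',
    LockDataSource => 'dbi:mysql:sessions',
    LockUserName   => 'session_user',
    LockPassword   => 'secret',
};

# From here on the hash behaves just as it does with the File backend.
$session{name} = 'example';
print "Session $session{_session_id} stored in MySQL\n";

untie %session;
```

Because every Web server can connect to the same database, this arrangement avoids the one-machine limitation of the File backend.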

Note also that no technique presented here is foolproof. The session management we covered associates requests with a single browser but not necessarily with a single user. If you offer a registration system, however, you can send users their unique session IDs when they log in, effectively letting them resume their sessions from anywhere.
