Counting your clients' cookies for Apache user management

When cookies are working correctly, we never think about them. When they're not, we worry. But there's middle ground: As this article explains, keeping an eye on our cookies can help enhance management of our user community.

Cookies are small pieces of text that a server sends to a client, which keeps them either in memory or in a file on local storage. Because HTTP is a stateless protocol, cookies are used to store state information about the client-server transaction. While cookies are a real convenience, they can also be a security risk. Looking beyond the security issues, however, cookies can also be put to work. You can monitor traffic to the Web sites on your Apache server, learn which features of their pages are being used (useful for site traffic analysis), and bolster server security by identifying users and tracking their activity.

Configuring Apache for cookie tracking
Before exploiting cookies for these sophisticated features, it’s important to know how to set Apache up for cookie usage in the first place. Apache comes with a module that generates cookies and sends them to clients. This module, mod_usertrack, must be loaded when Apache is configured.

In the configuration file httpd.conf, the line that loads the cookie module comes prewritten (but is usually commented out). To load the module when Apache starts, simply uncomment this line:
               LoadModule usertrack_module modules/mod_usertrack.so

Additional configuration is required once you have enabled cookie generation. You must configure the cookie itself and set up its logging. Because cookies are logged client by client, you can keep a detailed log of each client's site activity. The log file then becomes a security and analysis tool.

What kind of cookie?
The following lines in the httpd.conf file set the cookie format and logging conditions. If you omit a cookie name, Apache names its cookie Apache by default. However, you can supply the cookie name of your choice, as follows:
               CookieName          MyCookie

The format of the cookie is also optional. Apache defaults to Netscape cookie style:
               CookieStyle          Netscape

You can override this default if you like. Beyond Netscape's original format, cookie specifications were later published as RFC 2109 and RFC 2965. To select one of them instead of the default, use the following style names:

For cookie format RFC 2109, enter the following:
               CookieStyle          Cookie

And for cookie format RFC 2965, enter this:
               CookieStyle          Cookie2

Cookie duration
How long do you want the cookies to be active? This is decided at configuration time, and each cookie header will contain an expiration date based upon this configuration value. If you enter no expiration value, Apache defaults to a per-session cookie (stored in memory and discarded when the browser session ends). If you enter a numerical value, Apache reads it as a number of seconds (in this example, one hour):
               CookieExpires        3600

Alternatively, you can enter a time period in quotes:
               CookieExpires        "3 days"

Counting the cookies
Once you've enabled cookies and set the name, format, and expiration, you're ready for logging. When cookie tracking is enabled, Apache sends a cookie in response to any request that does not already carry one. To enable tracking in the httpd.conf file, the following line is required:
               CookieTracking             On

When you track cookies, you're tracking activity in a particular domain. Remember that the cookie that resides on the client machine usually includes the relevant domain and path for the server interaction. The CookieDomain directive is where you enter that domain name, which then becomes part of the client-stored cookie. The value must begin with a dot and contain at least one more dot, so .userdomain.com is legal but www.userdomain.com and .com are not. The directive is optional; by default, no domain is included in the header of the outbound cookie. If you don't enter a domain, however, you won't be able to specify the domain a client group is accessing.
               CookieDomain         .userdomain.com
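
Before editing httpd.conf, a candidate CookieDomain value can be checked with a few lines of Python. This is only a sketch of the documented rule (a leading dot plus at least one further dot); the function name is hypothetical and is not part of Apache or of any library.

```python
def is_valid_cookie_domain(domain: str) -> bool:
    """Check a CookieDomain value against the documented rule:
    it must begin with a dot and contain at least one more dot."""
    return domain.startswith(".") and "." in domain[1:]


# Hypothetical values, for illustration only.
for candidate in (".userdomain.com", "www.userdomain.com", ".com"):
    print(candidate, is_valid_cookie_domain(candidate))
```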

Where will the tracking data go?
You can set up a log file at time of configuration with a line in httpd.conf, as follows:
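               CustomLog  logs/clickstream "%{cookie}n %r %t"

In the line above, logs/clickstream names the log file itself: a file called clickstream in the logs directory, relative to the server root. In the format string, %{cookie}n logs the value of the tracking cookie, %r logs the first line of the request, and %t logs the time the request was received.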
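
As a rough illustration of what reading such a log might look like, the Python sketch below splits one line into its three fields, assuming the format string "%{cookie}n %r %t". The sample line, the Hit type, and the parse_line function are made up for this example; the cookie value is treated as an opaque token, and a "-" is assumed to mean that no cookie was logged.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    cookie: str      # tracking-cookie value, or "-" when none was logged
    method: str      # HTTP method from the request line
    path: str        # requested path from the request line
    when: datetime   # timestamp parsed from the %t field


# cookie token, then the request line, then the bracketed %t timestamp
LINE_RE = re.compile(r'^(\S+) (.*) \[([^\]]+)\]$')


def parse_line(line: str) -> Optional[Hit]:
    """Parse one clickstream line; return None if it does not fit the format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    cookie, request, stamp = match.groups()
    parts = request.split()
    if len(parts) < 2:
        return None
    # %t is written like 10/Oct/2000:13:55:36 -0700
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return Hit(cookie, parts[0], parts[1], when)


# Illustrative line only; real cookie values will look different.
sample = "10.0.0.7.1199999999 GET /products/index.html HTTP/1.1 [10/Oct/2000:13:55:36 -0700]"
hit = parse_line(sample)
print(hit.cookie, hit.path, hit.when.isoformat())
```

Keeping the parser separate from any reporting code means the same function can feed every analysis built on the log.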

Follow the data
Once you’ve generated this log, entries are posted to it with every client request. You can track a great deal with such a log. Here are some possibilities:
  • Keep track of all users by frequency-of-visit, e.g., once a day or twice a week.
  • Track all users by length-of-stay, i.e., how much time they spend on a Web site.
  • Match recent users against a list of all past users.
  • Track individual user activity by click activity, i.e., pages and links accessed (useful for Web site analysis).
  • Track individual user time spent on specific Web pages.
  • Track collective user activity by clicks for specific time periods, i.e., which links/buttons/features received the most use in a given day, week, etc.
  • Track collective user activity by page access, i.e., which Web pages were viewed most often—and for average length of time—in a given period.
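
Several of these measures can be sketched in a few lines of Python. The example below assumes the log has already been parsed into (cookie, path, timestamp) tuples; that tuple shape, the summarize function, and the 30-minute inactivity gap used to separate one visit from the next are all assumptions made for illustration, not anything Apache defines.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumption: inactivity that ends a visit


def summarize(hits):
    """hits: iterable of (cookie, path, when) tuples.

    Returns (visit_days, time_on_site, page_hits)."""
    by_user = defaultdict(list)
    page_hits = Counter()
    for cookie, path, when in hits:
        page_hits[path] += 1          # collective page counts include every request
        if cookie != "-":             # per-user stats need a cookie value
            by_user[cookie].append(when)

    visit_days = {}
    time_on_site = {}
    for cookie, times in by_user.items():
        times.sort()
        # frequency of visit: number of distinct calendar days seen
        visit_days[cookie] = len({t.date() for t in times})
        # length of stay: sum of (last - first) over each visit
        total = timedelta()
        start = prev = times[0]
        for t in times[1:]:
            if t - prev > SESSION_GAP:
                total += prev - start
                start = t
            prev = t
        total += prev - start
        time_on_site[cookie] = total
    return visit_days, time_on_site, page_hits


# Made-up records for illustration.
hits = [
    ("u1", "/index.html", datetime(2024, 1, 1, 9, 0)),
    ("u1", "/docs.html", datetime(2024, 1, 1, 9, 10)),
    ("u1", "/index.html", datetime(2024, 1, 2, 14, 0)),
    ("u2", "/index.html", datetime(2024, 1, 1, 9, 5)),
]
visit_days, time_on_site, page_hits = summarize(hits)
print(visit_days, time_on_site, page_hits.most_common(1))
```

Note that time on site is only an estimate: the log records when a page was requested, not when the user stopped reading it, so a single-request visit counts as zero.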

You can probably think of other ways to make use of this information, but this is a start. You can see where users are poking around, which users visit only casually, when a user is persistently on a Web site, and whether the usage patterns reflect normal use. The log file is easily manipulated; constructing utilities to extract specific information for security or analysis purposes is a simple exercise.
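
One such utility, sketched here in Python under stated assumptions, flags cookies whose request rate looks abnormal. The input is assumed to be (cookie, timestamp) pairs already extracted from the log, and the one-minute window and 60-request limit are arbitrary placeholders that would need tuning for a real site; the function name is hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def flag_heavy_users(hits, window=timedelta(minutes=1), limit=60):
    """Return the set of cookies with more than `limit` requests
    inside any span of length `window`."""
    by_user = defaultdict(list)
    for cookie, when in hits:
        by_user[cookie].append(when)

    flagged = set()
    for cookie, times in by_user.items():
        times.sort()
        recent = deque()  # timestamps still inside the sliding window
        for t in times:
            recent.append(t)
            while t - recent[0] > window:
                recent.popleft()
            if len(recent) > limit:
                flagged.add(cookie)
                break
    return flagged


# Made-up traffic: one client bursts, the other browses slowly.
base = datetime(2024, 1, 1, 12, 0, 0)
burst = [("bot", base + timedelta(seconds=i)) for i in range(10)]
normal = [("human", base + timedelta(minutes=5 * i)) for i in range(10)]
print(flag_heavy_users(burst + normal, limit=5))
```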

Next steps
It's possible to go even further in using cookies to enhance your Apache service. For instance, you can track user preferences and the applications users rely on. However, profiling users in this way raises privacy issues. How these functions are implemented and how the privacy question can be addressed will be discussed in an upcoming article.

About Scott Robinson

Scott Robinson is a 20-year IT veteran with extensive experience in business intelligence and systems integration. An enterprise architect with a background in social psychology, he frequently consults and lectures on analytics, business intelligence...