All the wonders of procmail, part 1

The procmail mail processing agent is very powerful but often confusing. Visit this powerful application with Jack Wallen, Jr. as he helps you understand procmail's sometimes archaic language.


Imagine being able to configure your mail-processing agent on your own personal computer to do whatever you want it to do! Imagine being able to configure, with a simple file edit, autoreplying, autoforwarding, playing specific sounds, filtering, and many other actions. Now imagine which application can do this. And finally, imagine which operating system you must be using to have these awesome features!

If you're not familiar with procmail, you don't know what you're missing. procmail works in conjunction with your local Mail Transport Agent (MTA) as a local mail-delivery agent and a powerful filtering system. Say what? Okay, let me explain. On Linux, an MTA such as sendmail receives mail for your machine, and a retrieval tool such as fetchmail can go out, poll your configured mail server (or your ISP's mail server), and bring your e-mail to your computer. Then the MDA (Mail Delivery Agent) takes the mail from where the MTA leaves off and drops it into your user-defined mailbox. (Imagine that the U.S. Postal Service is your MTA, your postal carrier is your MDA, and your postal carrier has permission to sort through your mail and remove all the junk you don't want.)

procmail does what it does (delivers your e-mail to your mailbox and filters out the unwanted) with such amazing flexibility that you could spend the rest of your computing life toying with its options and configurations. In this Daily Drill Down, we'll highlight some of the more useful options and help you create a highly personal and usable procmail recipe.

Recipe what?
First of all, procmail uses an rc file (resource configuration). Each user can have a file called .procmailrc in his or her home directory, and the administrator can provide a global file, /etc/procmailrc. When the global file exists, procmail reads it first and then moves on to the user’s own .procmailrc.

Note
Use caution when writing the global default file because it will be executed with root privileges.

We’ll begin with the single-user procmail and eventually migrate to a more complex procmail system that will aid the administrator with such tasks as redirection of e-mail, notification of e-mail, autoresponding, and filtering.

The .procmailrc file is made up of various recipes that pull off all the stunts that procmail is capable of achieving. Some of these recipes are simple, and some of them are complex. Regardless of the complexity of the recipe, there are commonalities throughout all of procmail that we’ll outline.

Writing procmail recipes
procmail has two kinds of recipes: delivering and nondelivering. If a delivering recipe is found to match, procmail considers the mail delivered and will cease processing the .procmailrc file after having successfully executed the action line of that recipe. If a nondelivering recipe is found to match, processing of the .procmailrc file will continue after the action line of this recipe has been executed. In other words, a delivering recipe ends the action on an e-mail, and a nondelivering recipe allows the action to continue.

The .procmailrc file is written in procmail's own compact syntax, which is documented in the procmailrc man page. Within this section, I will enlighten you on some of the syntax necessary for procmail recipes.

I like to look at the philosophy of creating procmail recipes in a very simple way: How do I want to act upon an incoming e-mail, and what key section of that e-mail will allow me to accomplish my goal? For this, you’ll want to look at an e-mail containing the following sections:
  • Destination header (To, Cc, Bcc, etc.)
  • From header
  • Date header
  • Subject header
  • Body

Finally, you will want to look at procmail as having these basic actions:
  • Directing
  • Redirecting
  • Filtering
  • Notification

Using expressions
procmail uses egrep-style regular expressions. A regular expression is a pattern that describes a set of strings (a string is a sequence of characters). Regular expressions are constructed much like arithmetic expressions: smaller expressions are combined with operators to form larger ones. procmail uses regular expressions to do such things as search a subject line for a string.

Let's use a very common example—spam. We all hate it, and we all want to do away with it. Within a GUI e-mail client, you can include keywords and keyphrases for the mail filters to search for and act upon. procmail does this exact thing by using expressions. For instance, you frequently receive spam with the subject “Make Money From Home.” How do you filter this so that such a message will never appear in your inbox? With regular expressions, of course. Let's look at a small section of code used to filter the subject line “Make Money From Home.” The following section of code is where procmail scans the subject (another bit of code is used to act upon the mail should the subject match):
* ^Subject:.Make Money From Home$

Within the above snippet (from a .procmailrc file), the leading * is not part of the expression; it marks the line as a procmail condition. The expression itself uses three operators:
  • ^ Marks the beginning of a line.
  • . Matches any single character (here, the space after the colon).
  • $ Denotes the end of a line.

egrep-style expressions offer further operators:
  • ? The preceding item is optional and will be matched at most once.
  • * The preceding item will be matched zero or more times.
  • + The preceding item will be matched one or more times.
  • {n} The preceding item will be matched exactly n times.
  • {n,} The preceding item is matched n or more times.
  • {n,m} The preceding item is matched at least n times, but not more than m times.

The brace forms come from egrep. Check the procmailrc man page before relying on them in a recipe, because procmail's own matcher may not accept them.
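Because procmail conditions are written in egrep style, a pattern can be tried out with grep -E before it goes into .procmailrc. The following is a minimal sketch under a few assumptions: the matches helper and the sample subject lines are invented for illustration, and the -i option is there to imitate procmail's case-insensitive matching.

```shell
# The condition from the article, minus the leading "* " that procmail needs.
pattern='^Subject:.Make Money From Home$'

# matches PATTERN LINE -> exit status 0 when LINE matches (hypothetical helper).
# -i imitates procmail's default case-insensitive matching.
matches() {
    printf '%s\n' "$2" | grep -Eiq -- "$1"
}

matches "$pattern" 'Subject: Make Money From Home' && echo "exact subject: match"
matches "$pattern" 'Subject: make money from home' && echo "lowercase subject: match"
matches "$pattern" 'Subject: Make Money From Home Today' || echo "trailing text: no match"
matches "$pattern" 'X-Subject: Make Money From Home' || echo "other header: no match"
```

The last two lines show the anchors at work: $ rejects anything after Home, and ^ rejects a header that merely ends in Subject:.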

So you could look at the above snippet of code as two expressions pieced together, with operators, to form one phrase:
  • Subject:
  • Make Money From Home

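You could take this one step further and filter out any mail with a subject that contains any one of the words Make, Money, From, or Home with the following change:
* ^Subject:.*(Make|Money|From|Home)

We’ve made the following additions:
  • .* Matches any run of characters, so the words can appear anywhere in the subject.
  • ( ) Groups the alternatives so that all of them stay tied to ^Subject:.
  • | Acts as an or operator within a condition (as in this or that or these: this|that|these). At the start of an action line, the same character is the standard Linux pipe symbol.
  • \ Not used above, but worth knowing: it escapes the character that follows so that it is matched literally. For example, \. matches an actual period rather than any character.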

Also, procmail can use the following:
  • ! Forward to the specified e-mail address. (Be careful, because this can also mean invert the condition, depending on where and how it's used.)
  • a* Any sequence of zero or more a's
  • a+ Any sequence of one or more a's
  • a? Either zero or one a
  • [^-a-d] Any character that’s not a dash, a, b, c, d or newline
  • (abc)* Zero or more times the sequence abc
Note
There are, of course, much more complicated (as well as powerful) means of filtering spam. Also note that the above expression will filter any message that has any combination of the words Make, Money, From, and Home within the subject.
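To see why the alternatives in a word list belong inside parentheses, the two forms can be compared with grep -E. This is only a sketch: the header text and the count_hits helper are made up for the demonstration, and grep stands in for procmail's own matcher.

```shell
# Header lines from a made-up message; its subject contains none of the words.
header='From: home-office@example.com
Subject: Weekly status report'

ungrouped='^Subject:.*Make|Money|From|Home$'
grouped='^Subject:.*(Make|Money|From|Home)'

# count_hits PATTERN -> number of header lines the pattern matches
# (hypothetical helper; -i imitates procmail's case-insensitive default).
count_hits() {
    printf '%s\n' "$header" | grep -Eic -- "$1"
}

echo "ungrouped pattern hits: $(count_hits "$ungrouped")"
echo "grouped pattern hits:   $(count_hits "$grouped")"
```

Without the parentheses, | splits the whole expression into four separate patterns, so the bare word From matches the From: header of practically every message. With the parentheses, only the Subject: line is examined.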

Keywords and special characters
Like most applications of this nature, procmail uses keywords and special characters to make life easier. Many of these keywords and characters should be obvious to catch as well as to use. Some examples are:
  • :0 Beginning of a recipe
  • :0: Beginning of a recipe that uses a lockfile (more on using lockfiles later)
  • To Compare what’s in the e-mail’s To: field to the condition
  • From Same as above but comparing the From: field
  • Subject Same as above but comparing the Subject: field
  • \/ Whatever matches the part of the expression that follows this pair of characters (a \ followed by a /, not a capital V) will be stored in the variable/keyword $MATCH. And $MATCH can then be used to create a filename (or whatever).
  • ^TO_ A special procmail expression designed to catch an e-mail address (or the beginning of an e-mail address) that is in any destination header (To, Cc, Resent-To, etc.)

Of course, there are a great many other keywords and metacharacters involved with procmail, but the above list will get us started.

The header
I like to think of the first section of the .procmailrc file as the header. This section of the file sets up the location of the user’s mailbox and the log file, as well as options used for debugging purposes.

This header section contains (but is not limited to) the following lines:
SHELL=/bin/sh
MAILDIR=$HOME/mail
DEFAULT=$MAILDIR/username
LOGFILE=$MAILDIR/log
##LOGABSTRACT=all
VERBOSE=no


These are the main entries for the header section:
  • SHELL=/bin/sh Sets the shell that procmail uses to run commands (in this case the standard Bourne-compatible shell) to avoid unnecessary problems.
  • MAILDIR=$HOME/mail Sets the directory in which your mail will be stored; it is procmail's working directory, so relative paths in recipes start here. Typically (but not necessarily), this is /home/USERNAME/mail.
  • DEFAULT=$MAILDIR/username Sets the default inbox.
  • LOGFILE=$MAILDIR/log Sets the location of the procmail log.
  • ##LOGABSTRACT=all This is for debugging purposes. Remove the ## when you need more information for debugging.
  • VERBOSE=no Again, this is for debugging. Set this option to yes when you need to debug a recipe.

Other useful environment variables for the header section include:
  • ORGMAIL=/path/to/system/mailbox Usually the system mailbox. If, for some reason, the mail could not be delivered, then this mailbox will be the last resort.
  • LOCKFILE=/path/to/global/lockfile Global semaphore file. (The use of a global lockfile is discouraged; whenever possible, use local lockfiles on a per-recipe basis instead.)
  • TIMEOUT=x Number of seconds (x) that must pass before procmail decides that some child it started must be hanging. If procmail decides that a process has hung, then that offending process will receive a TERMINATE signal from procmail, and processing of the rc file will continue. If the number is zero, then no timeout will be used, and procmail will wait forever until the child has terminated. The default is 960 seconds.
  • HOST Hostname of the machine
  • NORESRETRY Number of retries that are to be made if any process table full, file table full, out of memory, or out of swap space error should occur. If this number is negative, then procmail will retry indefinitely; if it is not specified, it defaults to four times.
  • SUSPEND=x Number of seconds (x) that procmail will pause if it has to wait for something that is currently unavailable (memory, fork, etc.). If it is not specified, it will default to 16 seconds.
  • MATCH This variable is assigned by procmail whenever it is told to extract text from a matching regular expression. It will contain all text matching the regular expression that follows the \/ characters.
  • INCLUDERC=/path/to/rc/file Names an rc file (relative to the current directory) that will be included here as if it were part of the current rc file.
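The header lines shown earlier can be written to a file from the shell. The sketch below is an illustration only: it works in a scratch directory (demo_home, a name invented here) instead of your real home directory, so nothing existing is overwritten, and it assumes mktemp is available on your system.

```shell
# Scratch directory stands in for $HOME so nothing real is touched.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/mail"

# The quoted 'EOF' keeps $HOME and $MAILDIR literal in the file;
# procmail expands them itself when it reads the rc file.
cat > "$demo_home/.procmailrc" <<'EOF'
SHELL=/bin/sh
MAILDIR=$HOME/mail
DEFAULT=$MAILDIR/username
LOGFILE=$MAILDIR/log
##LOGABSTRACT=all
VERBOSE=no
EOF

# Keep the rc file private; procmail may refuse one that others can write to.
chmod 600 "$demo_home/.procmailrc"

header_lines=$(wc -l < "$demo_home/.procmailrc" | tr -d ' ')
echo "wrote $header_lines header lines to $demo_home/.procmailrc"
```

Once the result looks right, the same lines can be copied to the top of your real ~/.procmailrc, with username replaced by your own mailbox name.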

The recipes
Now we get into the heart of the matter—the primary force behind procmail. The onions and potatoes (sorry, I'm a vegetarian, so no meat here), so to speak.

A procmail recipe is just what it sounds like: a recipe that is cooked up that will act upon an incoming e-mail. As I said before, there are delivering and nondelivering types of recipes, and here I'm going to help you understand delivering recipes (in different forms). What we will do is start piecing together a very simple recipe and graduate to more advanced recipes as we go.

Recipes—the basics
The basic procmail recipe looks like this:
:0
action


Adding in a few more possibilities gives you
:0 flags: lockfile_name
* condition 1
* condition 2
action


The first recipe has two lines. The first line, :0, marks the beginning of the recipe (all recipes will have this), and the second line, action, is the action that is to be taken on incoming mail. For instance, you can replace action with something as simple as
! xxx@xxxx.xxx

which will forward all mail to the indicated e-mail address (where xxx@xxxx.xxx is replaced with a legitimate e-mail address).

The second recipe contains two (or more) extra lines that are marked condition 1 and condition 2. The second recipe also adds to the first line both flags and a user-defined lockfile. A lockfile is very important within procmail, but you should understand it before using it. (We'll discuss lockfiles later.) The list of possible flags used in procmail is extensive and can be found within the procmailrc man page (at a prompt, type man procmailrc).

The conditions of a procmail recipe are all user-defined and are limited only by your imagination. You’ll spend the greatest amount of time and energy—as well as patience—with this section. A condition can be thought of as “if the e-mail meets these requirements, then pass it along to the following action statements.” Conditions can be stacked to an unlimited depth, which makes them both very powerful and very confusing.

Delivering recipes
The first recipes we’re going to look at are delivering recipes. Delivering recipes are those that cause the header and/or body of the mail to be written into a file, absorbed by a program, or forwarded to a mail address.

Before we actually begin cooking up our recipes, let's first write a safety net into our $HOME/.procmailrc file. Inserting the following two recipes above all other recipes will ensure that of all arriving mail, the last 32 messages will be preserved. In order for it to work as intended, you must create a directory named backup in your $MAILDIR before writing these recipes into your procmail's rc file. Create this directory with the following:
mkdir ~/mail/backup

(assuming that ~/mail is your user mail directory), and you are good to go.

With the backup directory in place, you can now create the first recipe in your ~/.procmailrc file. Open the rc file in your favorite editor and enter the following two recipes after the header section:
:0 c
backup

:0 ic
| cd backup && rm -f dummy `ls -t msg.* | sed -e 1,32d`


So you know what the above recipes are doing, let's break them down.

First recipe:
  • :0 c Begin the recipe and make a carbon copy of all mail.
  • backup Save each copy as a separate file (named msg. plus a unique suffix) in the backup directory.

Second recipe:
  • :0 ic Begin the recipe, ignore any write errors on this recipe, and work on a carbon copy of the mail.
  • | cd backup && rm -f dummy `ls -t msg.* | sed -e 1,32d` Change to the backup directory and, if the cd command works, list all files beginning with msg. (newest first), pipe the list to the sed command, which drops the first 32 names from it, and hand the remaining (older) names to rm for deletion. The name dummy is only a placeholder that keeps rm from being run without arguments when there is nothing to delete.
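The pruning pipeline can be tried safely outside of procmail. The following sketch assumes a scratch directory created with mktemp and 40 empty stand-in files whose timestamps are set one minute apart with touch -t; none of these names come from a real mailbox.

```shell
# Work in a scratch directory that stands in for ~/mail/backup.
backup_dir=$(mktemp -d)
cd "$backup_dir" || exit 1

# Create 40 fake messages, msg.01 (oldest) to msg.40 (newest), one minute apart.
i=1
while [ "$i" -le 40 ]; do
    n=$(printf '%02d' "$i")
    touch -t "2020010100$n" "msg.$n"
    i=$((i + 1))
done

# Same pipeline as the recipe: ls -t lists newest first, sed drops the first
# 32 names from that list, and rm deletes whatever names are left over.
rm -f dummy $(ls -t msg.* | sed -e 1,32d)

remaining=$(ls msg.* | wc -l | tr -d ' ')
echo "messages kept: $remaining"
```

After the run, the eight oldest files (msg.01 through msg.08) are gone and the 32 newest remain, which is what the recipe does to the backup directory each time a message arrives.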

With these backup recipes in place, you’re ready to begin experimenting with your first recipes.

Redirecting e-mail
The first of the delivering recipes we are going to look at is an obvious one, I hope. This recipe simply receives an e-mail and then forwards it to another address (without keeping a copy of the message in the receiving account). This can be accomplished with the recipe outlined above:
:0
! xxxxxx@xxxx.xxx


which will then redirect all e-mail to the address defined in the xxxxxx@xxxx.xxx space. The above recipe consists of only an action (which is to redirect all e-mail). We'll build on this recipe by adding a condition that will examine a specific portion of the e-mail and, if the condition is met, act upon it.

For our example, let's say that you want every e-mail sent to you from the e-mail address fiancee@herhome.com to be redirected to a personal account, fiance@myhome.com. The first thing you have to do is check any incoming mail for an address match with the following condition:
* ^From:.*fiancee@herhome\.com

With the above condition in place, only those e-mails coming from the specified address will be redirected. Of course, like almost every Linux application and command, this can be modified to suit many needs. Say, for instance, that you wish to send all e-mail coming from a certain domain (we'll use thisdomain.com) to another address. To do this, you will alter the above condition to
* ^From:\/.+@thisdomain\.com

which will redirect any e-mail coming in from thisdomain.com to the configured target address. Notice the difference between the two conditions: The first contains the entire e-mail address you wish to redirect, and the second contains only the @ and the domain name. What is new within this condition is the addition of \/.+, in which the .+ matches one or more characters (the part of the address in front of the @) and the \/ tells procmail to store everything matched after it in the $MATCH variable.
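A sender condition can be dry-run against a saved message before it is trusted with real mail. The sketch below rests on several assumptions: the message text and the header_matches helper are invented, grep -E stands in for procmail's matcher, and the procmail-only \/ token is left out because grep does not understand it.

```shell
# A made-up message; the blank line separates the header from the body.
message='From: fiancee@herhome.com
To: me@mailaddy.com
Subject: Dinner?

From: lines in the body, like this one, should not count.
someone@thisdomain.com is only mentioned here.'

# header_matches PATTERN -> status 0 if any header line matches (hypothetical helper).
# sed stops at the first blank line so the body is never searched,
# and -i imitates procmail's case-insensitive default.
header_matches() {
    printf '%s\n' "$message" | sed '/^$/q' | grep -Eiq -- "$1"
}

header_matches '^From:.*fiancee@herhome\.com' && echo "address condition: match"
header_matches '^From:.+@thisdomain\.com' || echo "domain condition: no match"
```

Cutting the message off at the first blank line mirrors the fact that a plain condition looks at the header, so an address that is only mentioned in the body does not trigger the recipe.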

Note
Make sure that you are using the \ followed by the / and not a capital v (V) in the above condition.

One step further
Okay, so you have your basic redirecting recipe that takes all e-mail from a certain source and sends it on its way to another. Let's say that you want to keep a copy for yourself. How do you do it? Very simple.

To keep a copy of the redirected e-mail, you need only add on to the current working recipe. The addition can be as simple as writing another action section; however, we are going to make the current recipe a bit more standard and stable.

Remember from above that we have only two lines. The first line indicated that a new recipe was beginning, and the second was the action. The problem with this recipe is that it is very nonspecific. We don't like that. What we want to do is make our recipes as specific as we can, and we can do this by using the tools we already have.

Instead of creating a recipe that says (basically) “take all incoming mail (regardless of address) and send it to X,” we want to say something more specific like “take all mail addressed to X and send it to Y.” This method will certainly take you further in your trek with procmail. So, let's implement it!

The first section of our pseudo code, “take all mail addressed to X,” can be handled with the following condition (using me@mailaddy.com for the incoming e-mail address):
* ^TO.*me@mailaddy.com

The above condition is fairly complete, but we’re not finished yet. Remember that we have to begin this with the :0 characters, so now our new recipe, so far, looks like this:
:0
* ^TO.*me@mailaddy.com


Now you're one step closer to a better recipe. The next step is to add the action lines that will mail a copy of the e-mail to another address (we’re going to use you@mailaddy.com) and keep a copy of it in the local default mailbox. The first thing we’ll do is create the two action sections for this recipe, and then we'll piece them together.

The first section is the remailing section, and it looks like this:
:0 c
! you@mailaddy.com


What will be unfamiliar in the above section is the c flag in the first line. The c flag tells procmail that we’re making a carbon copy of the e-mail. Also possibly unfamiliar (although mentioned earlier) is the ! character, which can hold two different meanings. Within a condition, the ! character means to invert the condition (or not). Within an action line, the ! character means to forward to the specified e-mail address.

Before we add the new section, let's put our improved recipe together. A recipe has only one :0 line, so the c flag moves up to the line that opens the recipe:
:0 c
* ^TO.*me@mailaddy.com
! you@mailaddy.com


Now you have a recipe worthy of building upon.

The final step in this recipe is to add in the section that will send a copy of the mail to the default inbox. To do this, however, we have to know how to enclose both of our new action statements within braces, {}, so that procmail knows that both action statements are to be used within the same recipe.

The section we’re now going to add is very simple:
:0
/path/to/mail/directory


Please keep in mind that, in the above section, you’ll be using your own user mailbox (we'll say that the username is Haversham). Say, for instance, that your user inbox is located in /home/Haversham/mail/Haversham and is defined in the header section of the .procmailrc file. If so, then all you’ll have to write in the new section of our current recipe is
:0
Haversham


and procmail will know that this is the local default mailbox for this user.

Now, putting the pieces together, we have a complete recipe that looks like this (each action inside the braces is a small recipe of its own and needs its own :0 line):
:0
* ^TO.*me@mailaddy.com
{
 :0 c
 ! you@mailaddy.com

 :0
 Haversham
}


Another step further
Let's add more to this recipe so that we know when a new mail arrives and is delivered by our current recipe. How? We add a third action inside the braces that plays a sound. A recipe can have only one action line, so the new action gets its own :0 line, and on that line we will use the following flags:
  • i Ignore any write errors on this recipe (the sound player never reads the mail that is piped to it).
  • c Generate a carbon copy of the e-mail, so that processing continues with the actions that follow.
  • h Feed only the header to the pipe, file, or mail destination (by default, both the header and the body are fed).

The new action line that plays the sound looks like
| play ~/path/to/sound/file

and is inserted as the first action inside the braces.

Our newer, tastier recipe now reads like this:
:0
* ^TO.*me@mailaddy.com
{
 :0 ich
 | play ~/path/to/sound/file

 :0 c
 ! you@mailaddy.com

 :0
 Haversham
}


Note
In the new action line, you will want to replace the play command with whatever application your system uses to play sounds, and you will want to explicitly configure the complete path to the sound file you wish to use.

Absorbing e-mail
The recipe we’re now going to create deals with adding spammers to a blacklist (filename: black.lst). Technically, the mail is absorbed by a file (/dev/null) rather than by a program, but the effect is the same: The recipe compares incoming headers to the blacklist file and, if any are found to match, it sends the e-mail to /dev/null (effectively discarding it).

This recipe covers all common sender headers as well as the To: field (since many spammers use phony addresses). This recipe is also a bit more complex than what we've used so far. Within this recipe we’re going to use formail to extract the contents of the header fields and egrep to search the blacklist file for spammers.

The first step is to create the black.lst file, which will be placed in your default mail directory (~/mail) and will simply be a list of suspect e-mail addresses like these:
spam1@spamaddy.com
spam2@spamaddy.com
spam3@spamaddy.com


Any time one of these e-mail addresses is matched in the header of an incoming e-mail, the e-mail will then be dumped into /dev/null and will therefore be gone forever.

The recipe looks like this:
:0
* ? formail -x"From" -x"From:" -x"Sender:" \
 -x"Reply-To:" -x"Return-Path:" -x"To:" \
 | egrep -is -f black.lst
/dev/null


The recipe is broken down like so:
  • :0 Begin the recipe (no lockfile used).
  • * ? formail -x"From" -x"From:" -x"Sender:" \
     -x"Reply-To:" -x"Return-Path:" -x"To:" \
     | egrep -is -f black.lst This condition is actually all one line (the trailing backslashes continue it onto the next line). The ? tells procmail to run the command that follows and to treat the condition as met if the command exits successfully. It reads something like this: “Use the formail program to extract the contents of any of the following fields: From, From:, Sender:, Reply-To:, Return-Path:, To:, and use the egrep application (while ignoring case and suppressing error messages) to look in them for any address listed in the black.lst file.”
  • /dev/null Deliver any message whose headers contain an address from the black.lst file to /dev/null, effectively erasing it.

The beauty of this recipe is that it allows you to simply add new spam addresses into a single file so that only one recipe is necessary. This is much more efficient than creating multiple recipes for any and all spammers.
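The lookup half of the condition can be tested on its own, without formail or procmail. In this sketch, the is_blacklisted helper and the sample header values are hypothetical; the values stand in for what formail -x would print, and -q is used in place of -s so that only the exit status matters.

```shell
# Scratch copy of the blacklist, one address per line as in the article.
list_dir=$(mktemp -d)
cat > "$list_dir/black.lst" <<'EOF'
spam1@spamaddy.com
spam2@spamaddy.com
spam3@spamaddy.com
EOF

# is_blacklisted TEXT -> status 0 if TEXT contains a listed address.
# The exit status is what a "?" condition in procmail acts upon.
is_blacklisted() {
    printf '%s\n' "$1" | grep -Eiq -f "$list_dir/black.lst"
}

is_blacklisted ' "Cheap Deals" <SPAM2@spamaddy.com>' && echo "spammer: discard"
is_blacklisted ' friend@example.org' || echo "friend: deliver"
```

Keep in mind that each line of black.lst is treated as a regular expression, so the dots in an address match any character. If that is too loose for your list, grep's -F option treats the entries as fixed strings instead.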

Note
Do not use the above recipe without having the initial backup system in place. Otherwise you run the risk of losing critical (or personal) e-mail.

Conclusion
In this first Daily Drill Down of this series, we've introduced procmail and worked up some fairly useful delivering recipes. You’re well on your way to understanding the true power of this tool.

The next time we visit procmail, we’re going to handle some more complicated nondelivering recipes as well as introduce and explain lockfiles. With the completion of the second Daily Drill Down of our procmail series, you should be confident in your procmail prowess. In the meantime, get cookin'!

About Jack Wallen

Jack Wallen is an award-winning writer for TechRepublic and Linux.com. He’s an avid promoter of open source and the voice of The Android Expert. For more news about Jack Wallen, visit his website jackwallen.com.
