
All the wonders of procmail, part 2: Lockfiles and nondelivering recipes

The procmail system of delivering e-mail is a powerful and oftentimes confusing one. From the use of lockfiles to the complex nondelivering recipes, Jack Wallen, Jr. clears the air for this powerful Linux system.


When last we left our intrepid procmail, we had discovered the wonders of delivering recipes. This time around, we're going to take a look at the more advanced nondelivering recipes as well as lockfiles. As we said in the last Daily Drill Down, a nondelivering recipe allows further action upon an incoming e-mail. Don't take the name to mean that such a recipe can't (or won't) deliver an e-mail. With the right flags, a recipe can deliver a copy of an e-mail to a file, a directory, or even another e-mail address while procmail carries on processing the original.

Nondelivering recipes are those that cause the output of a program or filter to be captured back by procmail. In other words, a nondelivering recipe can use an external application (one outside of procmail itself) and act upon that e-mail before the processing on the e-mail is considered complete.

In order to accomplish nondelivering recipes, a lockfile is often used. What is a lockfile? I'm glad you asked! In this Daily Drill Down, we’ll unravel the mystery that the Linux (and the UNIX) community calls lockfiles.

Lockfiles
It’s important to understand why lockfiles are used with certain recipes and why they’re not used with others. Understanding this crucial difference could keep you from losing a good deal of e-mail (as well as a good deal of time in retrieving that lost e-mail).

What do they do?
Very simply, a lockfile is a semaphore: a small file whose very existence says, "the resource this lock guards already has a process acting upon it and cannot be acted upon by any other process." With this handy-dandy tool, you can have one procmail process write to a mailbox or cache file and be sure that no other procmail process, busy with another e-mail that arrived at the same moment, will write to that file until the first process has come to completion.

How are they used?
In practice, we can use them for basic virus protection. Let's say that we have an outbreak of viruses that arrive as Visual Basic scripts (.vbs attachments). A simple way to defuse such a virus is to mangle the extension so that Windows (poor, poor Windows) cannot execute the suspect file. Once the extension is mangled, the e-mail can then be delivered to its rightful owner.

The lockfile matters whenever such a recipe writes to a file that other e-mail also ends up in. Without it, two procmail processes handling messages that arrive at the same moment could write to that file at once and garble it. With the lockfile, this won't happen, unless the recipe is poorly written, in which case no lockfile can help you!
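
The extension mangling described above could be sketched as a filter recipe. This is a minimal sketch rather than a complete virus defense: it assumes attachment names appear in the body as quoted name="..." or filename="..." parameters, and the .txt suffix is an arbitrary choice.

```procmail
# f: treat the pipe as a filter    b: feed (and replace) only the body
# B: test the condition on the body
# w: wait for sed to finish and check its exit code
:0 fbBw
* name=".*\.vbs"
| sed -e 's/\(name="[^"]*\.[Vv][Bb][Ss]\)"/\1.txt"/g'
```

Because the f flag makes this a filter, procmail reads sed's output back in, and the rewritten message continues on to whatever recipes follow.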

The fastest way to tell whether a recipe uses a lockfile is to look for the trailing : after the :0 at the top of the recipe. On its own, the trailing colon tells procmail to derive the lockfile name from the destination file. At other times a lockfile is explicitly named by appending a name, such as :0: vacation.lock, which is necessary when the action pipes to a program and procmail has no destination file name to work from. We'll discuss lockfiles more as we go along.
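
The two forms can be sketched side by side. The folder name work and the Subject tag are hypothetical; the second recipe is the familiar duplicate-suppression idiom, in which a message whose Message-ID is already in the cache is treated as delivered and dropped.

```procmail
# Implicit lockfile: the trailing colon alone. procmail derives the lock
# name from the destination (normally "work" plus ".lock").
:0:
* ^Subject:.*\[work\]
work

# Explicit lockfile: the action is a pipe with no >> redirect, so there is
# no destination file name to derive a lock from and we name one ourselves.
:0 Wh: msgid.lock
| formail -D 8192 msgid.cache
```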

Review the basics, quickly
If you remember, the basics of a procmail recipe look like this:
:0
action


Adding in a few more possibilities gives you this:
:0 flags: lockfile_name
* condition 1
* condition 2
action


The first recipe has two lines. The first line, :0, marks the beginning of the recipe (all recipes have this), and the second line, action, is the action that is to be taken on incoming mail. For instance, you can replace action with something as simple as
! xxx@xxxx.xxx

where xxx@xxxx.xxx is replaced with a legitimate e-mail address. This will forward all mail to the indicated e-mail address.

The second recipe adds flags and a user-defined lockfile to the first line, followed by two (or more) condition lines, marked condition 1 and condition 2. The flags available in procmail are:
  • H Egrep the header (the default)
  • B Egrep the body
  • D Tells the internal egrep to distinguish between uppercase and lowercase (contrary to the default, which is to ignore case).
  • A This recipe will not be executed unless the conditions on the immediately preceding recipe (on the current block-nesting level) without the A or a flag are matched as well. This allows you to chain actions that depend on a common condition.
  • a This has the same meaning as the A flag, with the additional condition that the immediately preceding recipe must have been successfully completed before this recipe is executed.
  • E This recipe executes only if the immediately preceding recipe was not executed. Execution of this recipe also disables any immediately following recipes with the E flag. This allows you to specify else if actions.
  • e This recipe executes only if the immediately preceding recipe failed (i.e., the action line was attempted but resulted in an error).
  • h Feed the header to the pipe, file, or mail destination (the default)
  • b Feed the body to the pipe, file, or mail destination (the default)
  • f Consider the pipe as a filter.
  • c Generate a carbon copy of this mail. This makes sense only on delivering recipes. The only nondelivering recipe this flag has an effect on is a nesting block. In order to generate a carbon copy, this will clone the running procmail process (lockfiles will not be inherited), whereby the clone will proceed as usual and the parent will jump across the block.
  • w Wait for the filter or program to finish and check its exitcode, which is normally ignored. (If the filter is unsuccessful, then the text will not have been filtered.)
  • W Has the same meaning as the w flag, but will suppress any “Program failure” message.
  • i Ignore any write errors on this recipe (usually due to an early closed pipe).
  • r Raw mode—do not try to ensure that the mail ends with an empty line. Write it out as is.

Flags are placed on the opening line of a recipe, such as
:0 A:

which says, "Begin a recipe, but process this recipe only if the conditions of the previous recipe were met, and create a lockfile for this process." (I just had to throw in that lockfile.)
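
As an illustration of chaining, here is a small sketch that uses the A and E flags together. The addresses and folder names are hypothetical.

```procmail
# Forward a copy of the boss's mail; c keeps the original moving.
:0 c
* ^From:.*boss@example\.com
! assistant@example.com

# A: runs only if the conditions of the recipe above matched.
# The trailing colon locks the "boss" mailbox while it is written.
:0 A:
boss

# E: runs only if the recipe immediately above was not executed.
:0 E:
everything-else
```

Mail from the boss is forwarded and then filed in boss; anything else skips both of those recipes and falls through to the E recipe.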

The conditions of a procmail recipe are all user-defined and are limited only by your imagination. You will spend the greatest amount of time and energy, as well as patience, in this section. A condition can be thought of this way: "If the e-mail meets these requirements, then pass it along to the following action statement." Conditions can be stacked to an unlimited depth, which makes them both very powerful and very confusing.
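
A short sketch of stacked conditions follows. Every condition must hold before the action runs; the address, the size limit, and the folder name are hypothetical.

```procmail
:0:
# the sender looks like the newsletter
* ^From:.*newsletter@example\.com
# a leading ! negates a condition
* !^Subject:.*unsubscribe
# the message is smaller than 50000 bytes
* < 50000
newsletters
```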

Let's use pseudo code, shall we?
To explain how the nondelivering recipes actually work, we're going to cook up a recipe with pseudo code and then advance through writing the same recipe in standard procmail syntax.

For those of you who do not know, pseudo code is a tool that programmers have used for a long time that allows them to write out their code (or small blocks of code) in an English-like language.

The first recipe will extract the From header from an incoming e-mail and append it to the end of a file. This is a cheap and dirty way to keep track of the names and addresses that come in through your system. You can simply cut and paste items from this database into your e-mail address book.

The one thing you have to watch for in this recipe is to make sure that you avoid duplicate entries. This is certainly possible without too much hassle (hey, it's Linux after all!). The recipe (which is actually two recipes in one) in pseudo code looks like this:
start the recipe using a lockfile
use formail to keep a cache of the sender addresses already seen
(this will keep us from having duplicates in our address database)
start the second half of the recipe only if the first recipe
failed to find a match in the database (again, using a lockfile)
use formail to extract the From header and dump it into the database


The real thing
With the pseudo code in place, this should be a fairly simple recipe. The recipe in code looks like this:
:0 Whc: address_cache.lock
| formail -rD 8192 address.cache

:0 ehc: address_cache.lock
| formail -x From: >> address.list

Let's break this code down, line by line:

  • :0 Whc: address_cache.lock Start the recipe, wait for the program to finish and check its exit code without logging a "Program failure" message (W), feed only the header to the pipe (h), work on a carbon copy so that the original e-mail carries on through the rest of the rcfile (c), and use the lockfile address_cache.lock.
  • | formail -rD 8192 address.cache Use formail to check whether this sender has been seen before. The -D option keeps a cache of seen values in address.cache, limited to roughly 8192 bytes; combined with -r, it tracks the sender's address rather than the Message-ID. formail exits successfully if the address is already in the cache and fails if it is new.
  • :0 ehc: address_cache.lock Start the second half of the recipe, which runs only if the preceding recipe failed, that is, only if the address was not yet in the cache (e). Feed the header to the pipe, work on a carbon copy of the e-mail, and use the same lockfile, address_cache.lock.
  • | formail -x From: >> address.list Use formail to extract the contents of the From header and append them to address.list (a name chosen here). This is a separate plain-text file because address.cache is maintained by formail -D in its own format and should not be appended to by hand.

For our second recipe, we're going to cook up a vacation-notification system that will alert senders that we're on vacation but will do so only once per sender. This recipe should be placed after any recipe that processes mailing lists. We will accomplish this by using a vacation.cache file (located in the default mail directory) that will keep a record of every notification sent out. With this recipe, we will also avoid sending out notifications to mailing lists or daemons.

To keep this recipe from seeming too overwhelming, we’re going to outline it with pseudo code first:
START
Start the recipe, extract the header, make a carbon copy of the e-mail,
and create a lockfile called vacation.lock
CONDITIONS
make sure the e-mail was addressed to you
make sure you do not reply to daemons and mailing lists
make sure you avoid mail loops
use formail to generate an auto-reply header and extract it to the file
vacation.cache to be used as a database to hold our already sent e-mail
addresses (we want to send the vacation message only once to
each address)
ACTIONS
if the name is not in the above cache file, we'll use sendmail
to reply to the address with our message


The real thing
The real recipe looks quite a bit different from the pseudo code but will perform just as we've outlined. You should note that this is actually two recipes. The first recipe maintains our vacation cache so that we have a database of the e-mail addresses already seen. You use this database so that you send your auto-responder mailing only once to each address. You will want to create this file, vacation.cache, with the command
touch ~/mail/vacation.cache
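
The path in the touch command assumes that procmail's mail directory is ~/mail. Relative file names in recipes, such as vacation.cache, are resolved against MAILDIR, so a sketch of the matching settings near the top of ~/.procmailrc might look like this (the log file name is hypothetical):

```procmail
# Assumption: mail folders and cache files live in ~/mail.
MAILDIR=$HOME/mail
# Hypothetical log file; handy for checking what each recipe did.
LOGFILE=$MAILDIR/procmail.log
```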

The second recipe actually puts together the e-mail to be sent to the extracted address. This second recipe acts only if the first recipe fails, which happens when the sender's address is not yet in the cache.

Here’s the actual recipe:
:0 Whc: vacation.lock
* $^To:.*\<$\LOGNAME\>
* !^FROM_DAEMON
* !^X-Loop: jwallen@techrepublic.com
| formail -rD 8192 vacation.cache

:0 ehc
| (formail -rI"Precedence: junk" \
-A"X-Loop: jwallen@techrepublic.com" ; \
echo "I received your mail,"; \
echo "I'm currently attending LinuxWorld Expo"; \
echo "and will return Friday, August 18."; \
echo "-- "; cat $HOME/.signature \
) | $SENDMAIL -oi -t


Now that we've seen the recipe in pseudo code and actual code, let's break it down line by line:
  • :0 Whc: vacation.lock This line starts the recipe, waits for formail to finish and checks its exit code without logging a failure message (W), feeds only the header to the pipe (h), works on a carbon copy so that the original e-mail is still delivered normally (c), and uses a lockfile called vacation.lock.
  • * $^To:.*\<$\LOGNAME\> Checks to make sure the e-mail was, in fact, addressed to you. The leading $ tells procmail to expand variables in the condition, so $\LOGNAME becomes your login name with any special regular-expression characters escaped.
  • * !^FROM_DAEMON Makes sure that it does not reply to daemons and mailing lists.
  • * !^X-Loop: jwallen@techrepublic.com Avoids mail loops (an e-mail comes from your account to your account and is auto-replied from your account back to your account, which could go on forever). Any message that already carries this header is skipped.
  • | formail -rD 8192 vacation.cache The header is piped to formail, which checks the sender's address against the cache in vacation.cache (roughly 8192 bytes) and adds the address if it is new.
  • :0 ehc If the immediately preceding recipe fails (the sender was not yet in the cache), we execute this recipe, feeding the header to the next pipe and working on a carbon copy of the mail. No lockfile is used.
  • | (formail -rI"Precedence: junk" \ Generate an auto-reply header addressed to the sender (-r), discarding the other original header fields, and insert a Precedence: junk field (-I).
  • -A"X-Loop: jwallen@techrepublic.com" ; \ Append the X-Loop: header to our newly generated e-mail so that the condition above can recognize it and we avoid e-mail loops.
  • echo * The first three lines beginning with echo are the lines of text you wish to include in your auto-responder. The last echo prints the "-- " signature delimiter, and cat appends your ~/.signature file.
  • ) | $SENDMAIL -oi -t Pipe the newly created message (everything produced by the commands within the parentheses) to sendmail, which reads the recipient from the message headers (-t) and delivers it.

Why is this considered a nondelivering recipe when it obviously delivers to an outside e-mail address? The main reason is the c flag. Both recipes hand only a carbon copy of the header to an external program (formail), so procmail does not treat the original e-mail as delivered and goes on processing it with whatever recipes follow. Even though the ultimate action is that an auto-generated e-mail is mailed to the sender of the original mail, many times no e-mail will be generated at all (for instance, when someone's e-mail address is already in the vacation.cache file). Either way, the original message still lands in your mailbox. Now is it obvious why this is considered a nondelivering recipe?
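
The difference the c flag makes can be seen in a small sketch. The subjects and folder names here are hypothetical.

```procmail
# Without c this is a delivering recipe: on a match the message is filed
# in "reports" and procmail stops reading the rcfile for this message.
:0:
* ^Subject:.*report
reports

# With c, procmail files a copy and carries on ...
:0 c:
* ^Subject:.*invoice
invoice-archive

# ... so the same message can still reach this recipe.
:0:
* ^Subject:.*invoice
invoices
```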

Conclusion
In this Daily Drill Down we unraveled the mystery of lockfiles, reviewed the basics of recipes, and created a couple of helpful nondelivering recipes. Of course procmail can do so much more! Do not assume that these two Daily Drill Downs are an exhaustive resource. For an exhaustive resource on procmail, please consult your imagination.

About Jack Wallen

Jack Wallen is an award-winning writer for TechRepublic and Linux.com. He’s an avid promoter of open source and the voice of The Android Expert. For more news about Jack Wallen, visit his website jackwallen.com.
