Look under the hood of any modern Linux distribution, and
you’ll find an assortment of gems hidden inside. There’s the Vi editor for all of
your text editing needs; SSH for secure connections; the Apache server for Web
site publishing; the Samba suite of tools for file sharing…the list goes on
and on.

One of the lesser-known lights, though, is cron, a tool
that automatically executes routine tasks at predefined intervals. When
commands (or scripts) must run on a regular basis,
administrators and developers naturally reach for cron.

Starting cron

If you have a relatively recent Linux distribution, some
version of cron is sure to be included in the base system. The most common
version is Vixie cron, named after its creator, Paul Vixie, but numerous other
variants also exist. Normally, the installation process will also install
startup and shutdown scripts into the /etc/rc.d/ directory hierarchy, which
will take care of starting and stopping cron when the system boots up or shuts
down.

To check if cron is running, look for the crond binary in the process list, as
follows:

$ ps ax | grep crond
  277 ?        S      0:00 /usr/sbin/crond -l10

If it’s running, great—you’re all set! If it’s not, then
you’ll have to locate the binary and start it manually, like this:

$ find / -name crond
/usr/sbin/crond
$ /usr/sbin/crond
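
One catch with the ps and grep approach is that grep can match its own entry in the process list. As a minimal sketch, assuming the pgrep utility is installed, the check can be wrapped in a small shell function. The function name cron_is_running is made up for this example, and the daemon may be called crond or cron depending on your distribution:

```shell
# Return success if a cron daemon appears in the process list.
# The daemon is named "crond" on some systems and "cron" on others.
cron_is_running() {
    for name in crond cron; do
        # -x matches the process name exactly, so the check never
        # matches itself the way "ps ax | grep crond" can.
        if pgrep -x "$name" > /dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

if cron_is_running; then
    echo "cron is running"
else
    echo "cron is not running"
fi
```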

You can also consider adding the startup command to your startup
scripts (/etc/rc.local is usually a good bet), so it starts automatically on your
next boot.

In the unusual event that your distribution doesn’t
automatically install cron for you, visit the official Web site for your
distribution (or look on the supplied CD-ROM), and you should be able to find
a cron package to install.

Understanding crontabs

Cron works by reading timetables from special files called crontabs.
Normally, there’s a system crontab file in /etc/crontab, which looks something like
this (the details will vary depending on your distribution):

# update locate database once a week
30 4 * * 0 /usr/bin/updatedb -c 1> /dev/null

# rotate logs daily
00 01 * * * /usr/sbin/logrotate /etc/logrotate.cf -c 1> /dev/null

Some distributions, such as Red Hat and Slackware, instead
create directories named /etc/cron.daily, /etc/cron.hourly, and so on, and set
up cron by default to run the scripts inside these directories at the specified
frequency.

Cron also allows users to create their own personal
crontabs. To do this, use the following command when logged in as a regular
user:

$ crontab -e

The first time you do this, you’ll be presented with a blank crontab file
that you can edit. When you save the file, it is installed as your personal
crontab under the /var/spool/cron/ directory (the exact location varies
between distributions).

When cron first starts up, it looks in /var/spool/cron and
/etc/cron.d/ for crontabs and loads them into memory along with the
system-level crontab. Cron then checks its schedule on a minute-by-minute basis
for tasks to be executed. If it finds any, it executes them and e-mails the
output of the command to the owner of the crontab (more on this later).

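Every entry in a crontab file represents an action to be
performed and consists of a series of six fields, separated by spaces or tabs.
(On systems running Vixie cron, entries in /etc/crontab and /etc/cron.d/ carry
one extra field, placed just before the command, naming the user to run it as.)
Here’s what each field represents: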

  • Field #1: The minute(s) of the
    hour at which the command is to be executed (0 to 59)
  • Field #2: The hour(s) at which the
    command is to be executed (0 to 23)
  • Field #3: The day of the month on
    which the command is to be executed (1 to 31)
  • Field #4: The month in which the
    command is to be executed (1 to 12)
  • Field #5: The day of the week on
    which the command is to be executed (0 to 6, where 0 = Sunday)
  • Field #6: The command or script to
    run

Here’s an example:

15 16 04 06 * /usr/local/script.sh

This means “on the 4th of June every year at 4:15 PM,
run the command script.sh.”
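
To make the field layout concrete, here is a small sketch that splits a user crontab entry into its six parts with the shell’s read builtin. The function name describe_entry and the labels it prints are invented for this example. It assumes the six-field layout described above, so it would not handle a system crontab line that carries a user field:

```shell
# Split one crontab entry into its five time fields and the command.
describe_entry() {
    # read assigns one word per variable; the last variable receives
    # the rest of the line, so commands with arguments stay intact.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$command"
}

describe_entry '15 16 04 06 * /usr/local/script.sh'
# prints: minute=15 hour=16 day-of-month=04 month=06 day-of-week=* command=/usr/local/script.sh
```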

The command or script specified in the last field of the
crontab must be executable. To make a script executable, use this command:

$ chmod +x <script-name>

Now let’s take a look at some of the more advanced aspects of cron.

Fine-tuning crontab schedules

With the basics of cron under your belt, we’ll start
by looking at the interesting things you can do with the time fields in a
crontab entry:

1. You can specify a range of times for each field by using
a hyphen. For example, the entry

00 08 01-15 * * /usr/local/backup.sh

means “run the script backup.sh at 8 AM every day for
the first 15 days of every month.”

2. You can specify multiple time values for a field by
separating them with commas. For example, the entry

05 06,08,10,12,14,16,18,20 * * * /usr/local/send.mail

means “run the script send.mail every two hours at five
minutes past the hour between 6 AM and 8 PM every day of every month.”

3. You can also use so-called step values to have cron skip
particular elements of a group or range. Just insert a forward slash after the
value range to specify the step quantity. This means that the following entry
is equivalent to the one above:

05 06-20/2 * * * /usr/local/send.mail

4. You can use shortcut syntax to specify “every
possible value” for a time field. This is accomplished by using an
asterisk, and it’s far more readable than the equivalent range or group syntax.
For example, the entry

30 23 * * * /usr/local/clean.pl

means “run the script clean.pl at 11:30 PM every day of
every month.”

5. You can use human-readable, three-character day and month
names instead of numbers. Note that if you specify both a day of the month
(field #3) and a day of the week (field #5), cron treats it as an OR condition
and runs the command on both the specified day(s) of the month and the
specified day(s) of the week. For example, the entry

00 11 10,20 * mon,wed,fri /usr/sbin/updatedb

means “run the command updatedb at 11 AM on the 10th
and 20th of every month, and also on every Monday, Wednesday, and Friday.”
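
If you want to convince yourself of what a range with a step value covers, the following sketch expands one into the list of values it stands for. It only illustrates the syntax and is not how cron itself parses entries. The function name expand_field is made up, and it handles just a single value or range with an optional step, not comma lists, asterisks, or names:

```shell
# Expand a cron range with an optional step, such as "06-20/2",
# into the comma-separated list of values it stands for.
expand_field() {
    range=${1%%/*}                  # part before the slash: "06-20"
    step=1
    case $1 in
        */*) step=${1##*/} ;;       # part after the slash: "2"
    esac
    start=${range%%-*}
    end=${range##*-}
    # Strip one leading zero so "06" or "08" is not read as octal.
    start=${start#0}; start=${start:-0}
    end=${end#0}; end=${end:-0}
    out=
    i=$start
    while [ "$i" -le "$end" ]; do
        out="${out:+$out,}$i"
        i=$((i + step))
    done
    printf '%s\n' "$out"
}

expand_field '06-20/2'
# prints: 6,8,10,12,14,16,18,20
```

The result matches the comma-separated hour list in the send.mail entry from item 2, which is why the two entries behave identically.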

To list the contents of a crontab, use:

$ crontab -l

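To delete a previously scheduled task, remove the corresponding
line from the crontab and save the file. To delete the
crontab file itself, use the following command (Vixie cron and most other
implementations use -r; a few, such as the dcron shipped with Slackware,
use -d instead):

$ crontab -r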
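
Removing a single entry does not have to involve an editor. A rough sketch of a non-interactive approach follows; drop_job is a hypothetical helper name, and the pipeline in the comment assumes your crontab command reads a new crontab from standard input when given "-", which common implementations do but which is worth checking in your man page:

```shell
# Filter a crontab on standard input, dropping every line that
# contains the given text; all other lines pass through unchanged.
drop_job() {
    grep -vF -- "$1"
}

# Demonstration on sample text rather than on a live crontab.
printf '%s\n' \
    '00 08 01-15 * * /usr/local/backup.sh' \
    '30 23 * * * /usr/local/clean.pl' |
    drop_job '/usr/local/backup.sh'
# prints: 30 23 * * * /usr/local/clean.pl

# Typical use (left commented out because it rewrites your crontab):
#   crontab -l | drop_job '/usr/local/backup.sh' | crontab -
```

It is a good idea to save the output of crontab -l to a file first, so you have a copy to restore from if the filter removes more than you intended.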

Handling cron mail

Normally, the output generated by every command in the
system crontab is automatically mailed to the system administrator (or, in the
case of a user crontab, to the user who owns it). If you have a lot of crontab
entries, this is a quick way of getting snowed under with e-mail. To avoid
this, you can redirect the output of the command/script in the crontab to the
system bitbucket /dev/null with the redirection operator:

15 16 * * * /usr/local/script.sh > /dev/null
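
Keep in mind that a plain > redirects only standard output, so anything the script writes to standard error is still mailed. That is often what you want, since it means you hear about failures only. The sketch below shows the difference using a stand-in function named noisy (invented for this example) in place of a real cron job:

```shell
# A stand-in for a cron job that writes to both output streams.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# Only standard output is discarded; the error line survives and
# is what cron would still mail.
left_over=$( { noisy > /dev/null; } 2>&1 )
echo "with '> /dev/null':      [$left_over]"

# Both streams are discarded, so nothing is left to mail.
left_over_all=$( { noisy > /dev/null 2>&1; } 2>&1 )
echo "with '> /dev/null 2>&1': [$left_over_all]"

# The equivalent crontab entry that silences a job completely:
#   15 16 * * * /usr/local/script.sh > /dev/null 2>&1
```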

An alternate (and more elegant) way to do this is simply to
redirect the output of cron runs to a special account, which the system
administrator can check on a regular basis. This separates cron mail from
“regular” mail and helps reduce the flow of mail coming into the
system administrator’s account, while simultaneously maintaining a record of
errors and cron messages.

First, create an account for cron and then add a line
referencing the special $MAILTO variable to the top of the crontab file:

MAILTO=<cron-account-name>

Once this crontab is saved, all cron mail generated by the
entries in the crontab will be redirected to the named account.

Three other variables can also be used in a crontab:

  • $SHELL: The shell to use when
    running commands or scripts from the crontab
  • $HOME: The home directory to use
  • $PATH: The default search path in
    which to look for commands and scripts
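
Putting these together, a crontab that sets all four variables might look like the one written out by the sketch below. Inside the crontab the assignments are written without the dollar sign. The account name cronmail and the PATH value are placeholders chosen for this example, not recommendations:

```shell
# Write a small example crontab to a temporary file.
# "cronmail" is a hypothetical account created to receive cron mail.
crontab_file=$(mktemp)
cat > "$crontab_file" <<'EOF'
MAILTO=cronmail
SHELL=/bin/sh
HOME=/tmp
PATH=/usr/local/bin:/usr/bin:/bin
15 16 * * * /usr/local/script.sh
EOF

cat "$crontab_file"

# To install the file as your crontab (this replaces the current one):
#   crontab "$crontab_file"
```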

Automating disk usage reports with cron

Now let’s look at a real-world example. A common use for cron
is generating and sending e-mail reports of disk usage to system
administrators. The script in Listing A does exactly that.

Nothing very difficult here—just a Bash script that creates
a report by running the df and du commands to calculate the disk usage
in different directories, and concatenates the results into a single e-mail
message. This message then gets e-mailed to the system’s root user.
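
As a rough idea of what such a script might contain, here is a minimal sketch. It is not the script from Listing A: the function names build_report and send_report are made up for illustration, the choice of /home and /tmp simply mirrors the directories in the sample report, and it assumes a mail command is installed on the system:

```shell
# Sketch of a disk usage report; not the article's Listing A.

# Print the report body on standard output.
build_report() {
    echo "FILESYSTEM USAGE:"
    df -k
    echo
    echo "USAGE IN /home:"
    # Errors (unreadable or missing directories) are discarded.
    du -sh /home/* 2> /dev/null
    echo
    echo "USAGE IN /tmp:"
    du -sh /tmp 2> /dev/null
}

# Mail the report to root. Assumes the "mail" command exists.
send_report() {
    build_report | mail -s "Disk usage on $(hostname) on $(date)" root
}

# In the script that cron runs, the last line would simply be:
#   send_report
```

Keeping the report in its own function means you can run build_report by hand to inspect the output before any mail is sent.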

Next, you need to set up a cron entry to run this script
every weekday:

00 08 * * mon-fri /usr/local/diskwatch.sh

This will run the script above at 8 AM Monday through Friday,
skipping Saturdays and Sundays.

Important note: The user into whose crontab the above entry
is inserted must have appropriate permissions to run the df and du
commands; otherwise, the script will fail with errors when cron tries to
run it. A quick way to test this is to run the script from the shell prompt
as that user, and watch to see if it executes correctly or generates errors.

Here’s a sample of the report generated:

From root Tue Jul 13 05:49:42 2004
From: Disk Usage Monitor <devnull@localhost>
Subject: Disk usage on / on Tue Jul 13 05:49:42 IST 2004
To: <root@localhost>

FILESYSTEM USAGE:
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda2              1984044    789972   1091660  42% /
/dev/hda1              2096160   1721216    374944  83% /mnt/hd

USAGE IN /home:
4.0k  /home/ftp
3.5M  /home/john
102M  /home/timothy
57.6M /home/barry
10k   /home/joe

USAGE IN /tmp:
328M  /tmp

Obviously, you can tune this script to make it more
sophisticated. For example, you can calculate how many users have home
directories over a specific size and flag those users in the report, or list
all files in /tmp that are over 60 days old. I’ll leave the variants to you. Have
fun, and if you come up with something really cool, post it in the article
discussion below!