Automating common tasks with cron

One of the lesser-known gems in every Linux distribution must be cron, a tool that can automatically execute routine tasks at predefined intervals. When it comes to commands (or scripts) that must be performed on a regular basis, administrators and developers naturally reach for cron. Here's a beginner's guide to this powerful tool.

Look under the hood of any modern Linux distribution, and you'll find an assortment of gems hidden inside. There's the Vi editor for all of your text processing needs; SSH for secure connections; the Apache server for Web site publishing; the Samba suite of tools for file sharing... the list goes on and on.

One of the lesser-known lights, though, must be cron, a tool that can automatically execute routine tasks at predefined intervals. When it comes to commands (or scripts) that must be performed on a regular basis, administrators and developers naturally reach for cron.

Starting cron

If you have a relatively recent Linux distribution, some version of cron is sure to be included in the base system. The most common version is Vixie cron, named after its creator, Paul Vixie, but numerous other variants also exist. Normally, the installation process will also install startup and shutdown scripts into the /etc/rc.d/ directory hierarchy, which will take care of starting and stopping cron when the system boots up or shuts down.

To check if cron is running, look for the crond binary in the process list, as follows:

$ ps ax | grep crond
  277 ?        S      0:00 /usr/sbin/crond -l10

If it's running, great—you're all set! If it's not, then you'll have to locate the binary and start it manually, like this:

$ find / -name crond
/usr/sbin/crond
$ /usr/sbin/crond

You can also consider adding the startup command to your startup scripts (/etc/rc.local is usually a good bet) so that cron starts automatically on your next boot.
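The process check above can also be wrapped in a small function. What follows is only a minimal sketch: it assumes the pgrep utility is installed, and the function name cron_status is made up for illustration. The daemon is called crond on Red Hat and Slackware but cron on Debian-based systems, so the sketch looks for both names.

```shell
#!/bin/sh
# cron_status: report whether a cron daemon appears in the process list.
# Assumes pgrep is available; the function name is illustrative only.
cron_status() {
    # -x matches the process name exactly; unlike "ps | grep", pgrep
    # never lists itself in the results
    if pgrep -x crond >/dev/null 2>&1 || pgrep -x cron >/dev/null 2>&1; then
        echo "cron is running"
    else
        echo "cron is not running"
    fi
}

cron_status
```

If pgrep is missing, the sketch reports "cron is not running", so treat a negative answer as a prompt to double-check with ps.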

In the unusual event that your distribution doesn't automatically install cron for you, visit the official Web site for your distribution (or look on the supplied CD-ROM), and you should be able to find and install a cron package.

Understanding crontabs

Cron works by reading timetables from special files called crontabs. Normally, there's a system crontab file in /etc/crontab, which looks something like this (again, this will vary depending on your distribution):

# update locate database once a week
30 4 * * 0 /usr/bin/updatedb -c 1> /dev/null

# rotate logs daily
00 01 * * * /usr/sbin/logrotate /etc/logrotate.cf -c 1> /dev/null

Some distributions, such as Red Hat and Slackware, instead create directories named /etc/cron.daily, /etc/cron.hourly, and so on, and set up cron by default to run the scripts inside these directories at the specified frequency.

Cron also allows users to create their own personal crontabs. To do this, use the following command when logged in as a regular user:

$ crontab -e

If you don't have a crontab yet, you'll be presented with an empty file that you can edit. When you save the file, your crontab is stored under the /var/spool/cron/ directory.

When cron first starts up, it looks in /var/spool/cron and /etc/cron.d/ for crontabs and loads them into memory along with the system-level crontab. Cron then checks its schedule on a minute-by-minute basis for tasks to be executed. If it finds any, it executes them and e-mails the output of the command to the owner of the crontab (more on this later).

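Every entry in a user crontab represents an action to be performed and consists of a series of six fields, separated by blank spaces. (With Vixie cron, entries in /etc/crontab and in /etc/cron.d/ take one extra field, the name of the user to run the command as, just before the command.) Here's what each field represents: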

  • Field #1: The minute(s) of the hour at which the command is to be executed (0 to 59)
  • Field #2: The hour(s) at which the command is to be executed (0 to 23)
  • Field #3: The day of the month on which the command is to be executed (1 to 31)
  • Field #4: The month in which the command is to be executed (1 to 12)
  • Field #5: The day of the week on which the command is to be executed (0 to 6, where 0 = Sunday)
  • Field #6: The command or script to run

Here's an example:

15 16 04 06 * /usr/local/script.sh

This means "on the 4th of June every year at 4:15 PM, run the command script.sh."
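To see how the six fields line up, the example entry can be split apart in the shell. This is just an illustrative sketch (the function name describe_entry is an assumption, not part of cron); it relies on the fact that the shell's read builtin puts everything left over into the last variable, which is how the command keeps its own spaces.

```shell
#!/bin/sh
# describe_entry: split one crontab line into its six fields and label them.
describe_entry() {
    # read assigns the first five words to the time fields; the rest of
    # the line (the command, spaces included) lands in the last variable
    read -r minute hour day_of_month month day_of_week command <<EOF
$1
EOF
    echo "minute=$minute hour=$hour day_of_month=$day_of_month month=$month day_of_week=$day_of_week"
    echo "command=$command"
}

describe_entry "15 16 04 06 * /usr/local/script.sh"
```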

The command or script specified in the last field of the crontab must be executable. To make a script executable, use this command:

$ chmod +x <script-name>

Fine-tuning crontab schedules

Now that you have the basics of cron under your belt, let's take a look at some of the more advanced aspects. We'll start by looking at the interesting things you can do with the time fields in a crontab entry:

1. You can specify a range of times for each field by using a hyphen. For example, the entry

00 08 01-15 * * /usr/local/backup.sh

means "run the script backup.sh at 8 AM every day for the first 15 days of every month."

2. You can specify multiple time values for a field by separating them with commas. For example, the entry

05 06,08,10,12,14,16,18,20 * * * /usr/local/send.mail

means "run the script send.mail every two hours at five minutes past the hour between 6 AM and 8 PM every day of every month."

3. You can also use so-called step values to have cron skip particular elements of a group or range. Just insert a forward slash after the value range to specify the step quantity. This means that the following entry is equivalent to the one above:

05 06-20/2 * * * /usr/local/send.mail
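If you want to convince yourself that the list and step forms select the same hours, a few lines of shell and awk can expand a field into its values. This is a rough sketch, not how cron itself parses crontabs: the function name expand_field is made up, and asterisks and month or day names are deliberately left out.

```shell
#!/bin/sh
# expand_field: print the values selected by a cron time field that uses
# lists (a,b), ranges (a-b) and steps (a-b/n). Asterisks and names are
# not handled in this sketch.
expand_field() {
    echo "$1" | awk '{
        n = split($0, items, ",")
        out = ""; sep = ""
        for (i = 1; i <= n; i++) {
            range = items[i]; step = 1
            if (split(range, parts, "/") == 2) { range = parts[1]; step = parts[2] + 0 }
            if (step < 1) step = 1              # guard against a zero step
            if (split(range, ends, "-") == 2) { first = ends[1] + 0; last = ends[2] + 0 }
            else { first = range + 0; last = first }
            for (v = first; v <= last; v += step) { out = out sep v; sep = " " }
        }
        print out
    }'
}

expand_field "06-20/2"
expand_field "06,08,10,12,14,16,18,20"
```

Both calls print the same list of hours, 6 through 20 in steps of two.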

4. You can use shortcut syntax to specify "every possible value" for a time field. This is accomplished by using an asterisk, and it's far more readable than the equivalent range or group syntax. For example, the entry

30 23 * * * /usr/local/clean.pl

means "run the script clean.pl at 11:30 PM every day of every month."

5. You can use human-readable, three-character day and month names instead of numbers. Note that if you specify both a day of the month (field #3) and a day of the week (field #5), cron treats it as an OR condition and runs the command on both the specified day(s) of the month and the specified day(s) of the week. For example, the entry

00 11 10,20 * mon,wed,fri /usr/sbin/updatedb

means "run the command updatedb at 11 AM on the 10th and 20th of every month, and also on every Monday, Wednesday, and Friday."
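The OR rule is easy to get wrong, so here is a small sketch that mimics it for the updatedb entry above. The function name entry_runs_on is an assumption made for illustration, and the day values are hard-coded to match that one entry; the point is only that a match on either field is enough.

```shell
#!/bin/sh
# entry_runs_on: decide whether the entry "00 11 10,20 * mon,wed,fri ..."
# fires on a given day. $1 = day of the month, $2 = three-letter weekday.
entry_runs_on() {
    case "$1" in
        10|20) return 0 ;;        # day-of-month field matches
    esac
    case "$2" in
        mon|wed|fri) return 0 ;;  # day-of-week field matches
    esac
    return 1                      # neither field matches
}

for day in "10 sun" "13 wed" "14 thu"; do
    # word splitting on $day is intentional: it supplies both arguments
    if entry_runs_on $day; then echo "$day: runs"; else echo "$day: skipped"; fi
done
```

A Sunday the 10th and a Wednesday the 13th both run the command; a Thursday the 14th does not.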

To list the contents of a crontab, use:

$ crontab -l

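To delete a previously scheduled task, remove the corresponding line from the crontab and save the file. To delete the crontab file itself, use the command below. Vixie cron and most other implementations use the -r option; a few, such as the dcron package shipped with Slackware, use -d instead, so check the crontab man page on your system:

$ crontab -r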
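Crontabs can also be updated without opening an editor, because crontab accepts a complete new crontab on standard input when given - as the file name. The sketch below assumes that behavior and uses a made-up helper, add_entry, which appends an entry only if it is not already present. Keep in mind that installing from standard input replaces the whole crontab, which is why the existing contents are passed through first.

```shell
#!/bin/sh
# add_entry: read crontab text on stdin and print it with one entry appended,
# unless an identical line is already present. Function name is illustrative.
add_entry() {
    awk -v entry="$1" '
        { print; if ($0 == entry) seen = 1 }
        END { if (!seen) print entry }
    '
}

# Demonstration on sample text rather than a live crontab
printf '30 23 * * * /usr/local/clean.pl\n' |
    add_entry "00 08 01-15 * * /usr/local/backup.sh"

# Against a real crontab the same filter could sit between two crontab calls:
#   crontab -l 2>/dev/null | add_entry "00 08 01-15 * * /usr/local/backup.sh" | crontab -
```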

Handling cron mail

Normally, the output generated by every command in the system crontab is automatically mailed to the system administrator (or, in the case of a user crontab, to the user who owns it). If you have a lot of crontab entries, this is a quick way of getting snowed under with e-mail. To avoid this, you can redirect the standard output of the command or script to the system bit bucket, /dev/null, with the redirection operator, as in the entry below. Error messages go to standard error, so they will still be mailed unless you also append 2>&1 to the entry.

15 16 * * * /usr/local/script.sh > /dev/null
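The difference between discarding standard output only and discarding both output streams can be tried out at the shell prompt. In this sketch, noisy_job is a made-up stand-in for a cron command that prints one line to each stream; whatever ends up in the variables is what cron would still have to mail.

```shell
#!/bin/sh
# noisy_job stands in for a cron command that writes to both output streams.
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

# Anything captured here is what cron would still have to mail.
stdout_discarded=$( { noisy_job > /dev/null; } 2>&1 )
both_discarded=$( { noisy_job > /dev/null 2>&1; } 2>&1 )

echo "with '> /dev/null':      [$stdout_discarded]"
echo "with '> /dev/null 2>&1': [$both_discarded]"
```

With the plain redirect, the error line survives, which is often what you want: silence when all is well, and mail only when something goes wrong.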

An alternate (and more elegant) way to do this is simply to redirect the output of cron runs to a special account, which the system administrator can check on a regular basis. This separates cron mail from "regular" mail and helps reduce the flow of mail coming into the system administrator's account, while simultaneously maintaining a record of errors and cron messages.

First, create an account for cron and then add a line referencing the special $MAILTO variable to the top of the crontab file:

MAILTO=<cron-account-name>

Once this crontab is saved, all cron mail generated by the entries in the crontab will be redirected to the named account.

Three other variables can also be used in a crontab:

  • $SHELL: The shell to use when running commands or scripts from the crontab
  • $HOME: The home directory to use
  • $PATH: The default search path in which to look for commands and scripts
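Put together, the top of a crontab that sets these variables might look like the file written out below. This is a sketch only: the account name cronmail, the home directory, and the search path are placeholders chosen for illustration, not values from any real system.

```shell
#!/bin/sh
# Write a sample crontab to a temporary file. The account name "cronmail",
# the home directory and the search path are placeholders.
crontab_file=$(mktemp)
cat > "$crontab_file" <<'EOF'
# variable assignments come before the entries they should affect
MAILTO=cronmail
SHELL=/bin/sh
HOME=/home/john
PATH=/usr/local/bin:/usr/bin:/bin

15 16 * * * /usr/local/script.sh
EOF

cat "$crontab_file"

# To install it (this replaces the current user's crontab):
#   crontab "$crontab_file"
```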

Automating disk usage reports with cron

Now let's look at a real-world example. A common use for cron is generating and sending e-mail reports of disk usage to system administrators. The script in Listing A does exactly that.

Nothing very difficult here—just a Bash script that creates a report by running the df and du commands to calculate the disk usage in different directories, and concatenates the results into a single e-mail message. This message then gets e-mailed to the system's root user.
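For readers without the listing at hand, here is a minimal sketch of what a script along those lines might look like. It is not the original listing: the function name build_report and the section layout are assumptions, and the mail step assumes a mail command is installed, so it is left as a comment.

```shell
#!/bin/sh
# Sketch of a disk usage report in the spirit of the script described above.
# Not the original listing; function name and layout are assumptions.

# build_report: print overall filesystem usage, then usage per home
# directory and for /tmp.
build_report() {
    echo "FILESYSTEM USAGE:"
    df -k
    echo
    echo "USAGE IN /home:"
    # -s prints one total per argument, -h uses human-readable sizes;
    # errors from unreadable directories are dropped
    du -sh /home/* 2>/dev/null
    echo
    echo "USAGE IN /tmp:"
    du -sh /tmp 2>/dev/null
}

build_report

# To mail the report to root instead of printing it, assuming a mail
# command is installed:
#   build_report | mail -s "Disk usage on $(date)" root
```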

Next, you need to set up a cron entry to run this script every weekday:

00 08 * * mon-fri /usr/local/diskwatch.sh

This will run the script above every day at 8 AM, with the exception of Saturdays and Sundays.

Important note: The user into whose crontab the above entry is inserted must have appropriate permissions to run the df and du commands; otherwise, the command will fail with errors when cron tries to run it. A quick way to test this is to run the command from the shell prompt as that user, and watch to see if it executes correctly or generates errors.

Here's a sample of the report generated:

From root Tue Jul 13 05:49:42 2004
From: Disk Usage Monitor <devnull@localhost>
Subject: Disk usage on / on Tue Jul 13 05:49:42 IST 2004
To: <root@localhost>

FILESYSTEM USAGE:
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda2              1984044    789972   1091660  42% /
/dev/hda1              2096160   1721216    374944  83% /mnt/hd

USAGE IN /home:
4.0k  /home/ftp
3.5M  /home/john
102M  /home/timothy
57.6M /home/barry
10k   /home/joe

USAGE IN /tmp:
328M  /tmp

Obviously, you can tune this script to make it more sophisticated. For example, you can calculate how many users have home directories over a specific size and flag those users in the report, or list all files in /tmp that are over 60 days old. I'll leave the variants to you. Have fun, and if you come up with something really cool, post it into the article discussion below!
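As a starting point for those two variants, the functions below show one way they could be written with find and du. Treat them as sketches under stated assumptions: the function names old_files and large_dirs are invented here, and the 100 MB limit is an arbitrary example.

```shell
#!/bin/sh
# Two possible report extensions; function names are illustrative.

# old_files: list regular files under directory $1 that were last modified
# more than $2 days ago.
old_files() {
    find "$1" -type f -mtime +"$2" 2>/dev/null
}

# large_dirs: list entries directly under $1 that use more than $2 KB.
# du -sk prints "size<TAB>path", so awk splits on the tab character.
large_dirs() {
    du -sk "$1"/* 2>/dev/null | awk -F'\t' -v limit="$2" '$1 > limit { print $2 }'
}

old_files /tmp 60
large_dirs /home 102400   # home directories over 100 MB
```

The output of either function can be appended to the report before it is mailed.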
