

DOWNLOAD: Use built-in UNIX tools to back up data

By Mark W. Kaelin, Editor

What is your data backup regimen? How is the process automated? What advice beyond this download do you have for your peers?


All Comments

Add encryption

by stress junkie In reply to DOWNLOAD: Use built-in UN ...

I really liked the download. It almost mirrors my own work of recent months to do the same thing. I even go against the traditional Unix naming convention by adding the .sh file extension to my Bash scripts, just as you did in the article.

I have also been experimenting with getting my backups encrypted. So far the easiest thing that I've come up with is to create encrypted container files on a disk partition, each the same size as the capacity of the backup medium. You attach these encrypted container files to /dev/loop* devices, then mount the /dev/loop* device(s) at regular mount points in the file system. Make the mount point(s) the destination of your tar operation. If your backup will span more than one unit then you can use the GNU tar capability of running a script at the end of the medium. This script will change the destination variable used by the tar operation. I'm still working on this. So if you need to use three DVDs or tapes then you create one encrypted container file for each tape or DVD. You mount them concurrently at different mount points such as /mnt/1, /mnt/2, and /mnt/3. Your script will first set your destination variable to /mnt/1. When that container file fills up, the script that tar executes should change the backup destination variable to /mnt/2, and so on. Another approach might be to unmount /mnt/1 and remount the /mnt/2 container at /mnt/1. I don't know yet as I haven't tried it.

Addendum: I've just figured this out. Make a link to your first mount point when the backup script starts. Then the script that runs when GNU tar fills up its medium recreates the link so that it points to the next mount point. So you could have encrypted files mounted at /mnt/1, /mnt/2, and /mnt/3 with a link /mnt/backup that initially points to /mnt/1. When GNU tar fills up this container file it will run your script, which recreates the /mnt/backup link to point to /mnt/2. You could accommodate any number of backup media units like this.
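The link switch described in the addendum might look something like the sketch below. The function name next_volume and its argument order are illustrative assumptions, not part of the original scripts; it simply repoints a symbolic link at the mount point that follows the one it targets now.

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions for illustration).
# usage: next_volume LINK MOUNT1 MOUNT2 ...
# Repoints LINK at the mount point listed after the one it targets now.
next_volume() {
    local link="$1" current next=""
    shift
    current=$(readlink "$link") || return 1
    while [ $# -gt 1 ]; do
        if [ "$1" = "$current" ]; then
            next="$2"
            break
        fi
        shift
    done
    # Fail when no later mount point is available.
    [ -n "$next" ] || return 1
    ln -sfn "$next" "$link"
}
```

A volume-change script handed to GNU tar with -F (--info-script) in multi-volume mode (-M) could then call next_volume /mnt/backup /mnt/1 /mnt/2 /mnt/3. Keep in mind that GNU tar does not allow compression on multi-volume archives.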

You can create an encrypted container file in much the same way that you would create an encrypted file system on a disk partition. Note that my encrypted container files are on an encrypted partition. That means that the data goes through one decryption operation to get from the partition to RAM, then two encryption operations to get from RAM to the container file. First it is decrypted through /dev/loop0 to get into RAM. Then it is encrypted through /dev/loop0 to get back to the disk partition. Then it is encrypted through /dev/loop1 to get to the encrypted container file, or vice versa. Amazingly this has very little CPU overhead. (So when people tell me that they don't have the spare CPU cycles to use encrypted file systems I say that they don't know what they're talking about. All data on disk should be encrypted. Then when the disk breaks and you discard it you don't have to worry about confidential data on the disk platters.) You must be sure to run a sync command before you unmount. Sync can take four or five seconds to flush the disk cache onto the disk.

If you are using the twofish256 encryption algorithm you would basically do the following to create an encrypted container file on any disk partition. Note that with 10GB Travan 5 tapes I have found that most of them hold less than the rated capacity. I have found that almost all will hold at least 9.6GB so that is the size of the container files that I make. I will use the file /bkp/tape01.enc for the container file and I will use /dev/loop0 for the loop device in the following example.

Load the encryption module into the kernel if it isn't already available to the kernel.

insmod twofish256

Create a 9.6GB container file for Travan 5 tapes.

dd if=/dev/urandom of=/bkp/tape01.enc bs=1024 count=9375000
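The count passed to dd is just the target size divided by the block size. Here is a quick way to check the arithmetic, assuming that 9.6GB means 9,600,000,000 bytes (the helper name blocks_for is made up for this sketch):

```shell
#!/bin/bash
# Hypothetical helper: number of whole blocks that fit into a size in bytes.
# usage: blocks_for SIZE_BYTES BLOCK_BYTES
blocks_for() {
    echo $(( $1 / $2 ))
}

# 9.6 GB (decimal) expressed in 1024-byte blocks
blocks_for 9600000000 1024    # prints 9375000
```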

Create the connection between the container file and the loop device. Note that losetup will ask you for a password. This password is used to create the encryption key and therefore cannot be changed. You will need the same password to decrypt the data.

losetup -e twofish256 /dev/loop0 /bkp/tape01.enc

Create a file system in the container file. The loop device connection will perform the encryption automatically.

mkfs -t ext2 -c -v -b 1024 /dev/loop0

Now you can mount /dev/loop0 like a disk partition.

mount -t ext2 /dev/loop0 /mnt/1

Once you finish performing your backup you can then do the following to get the encrypted backup onto tape or DVD or whatever you use. These instructions are for a Travan 5 tape at /dev/ht0.


umount -d /mnt/1

mt -f /dev/ht0 erase

mt -f /dev/ht0 retension

dd if=/bkp/tape01.enc of=/dev/ht0 bs=10240

This takes about three hours with Travan 5 even if you only backed up one byte into the container file. You could use tar -cz to reduce the time by a little bit but since we created the container file using random numbers to fill it the container file may not compress very much.

Note that you can reuse the container files. Start with the losetup command above. Skip the mkfs command. Instead you can use rm to remove the previous tar file(s).

I also like to include files created by tar to document the backup. I perform a verbose backup and have the output redirected to a log file in the /mnt/1 mount point for the container file. I also use the date function to create the names of the files, and I use a meaningful name such as rootfs or home to name the tar files. A typical tar file name would be rootfs-2005-01-01.tgz for a backup of the root file system on January 01, 2005.
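The naming scheme can be built with the format string of the date command. In this sketch the function name backup_basename and the variable names BKDIR, BKNAME, and BKLOGFILE are assumptions; only the resulting pattern, such as rootfs-2005-01-01.tgz, comes from the description above.

```shell
#!/bin/bash
# Hypothetical helper: build "LABEL-YYYY-MM-DD" using today's date.
backup_basename() {
    echo "$1-$(date +%Y-%m-%d)"
}

BKDIR=/mnt/1                        # mount point of the container file
BKNAME=$(backup_basename rootfs)    # e.g. rootfs-2005-01-01
BKTARFILE="$BKDIR/$BKNAME.tgz"      # the tar file itself
BKLOGFILE="$BKDIR/$BKNAME.log"      # verbose tar output goes here

# The backup itself would then be something like:
#   tar -czvf "$BKTARFILE" / > "$BKLOGFILE" 2>&1
```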

I back up entire file systems so I use an exclude file list instead of an include file list. I especially don't want to back up the container files so I put *.enc into the exclude file list. I also put *.iso and /download into the exclude file list so that I don't back up ISO files or the software installation kits like Mozilla that live in the /download directory. (Actually the download directory lives somewhere else but I am reluctant to put my whole file system structure into this post.)
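An exclude list is a plain text file with one pattern per line, passed to tar with -X (--exclude-from). The sketch below is one way to wire it up; the function name and the argument order are assumptions, and the behavior described is that of GNU tar.

```shell
#!/bin/bash
# Hypothetical wrapper: archive directory SRC into ARCHIVE, skipping every
# pattern listed (one per line) in the file EXCLUDES.
# usage: backup_with_excludes ARCHIVE SRC EXCLUDES
backup_with_excludes() {
    tar -czf "$1" -X "$3" -C "$2" .
}

# Example contents of an exclude list:
#   *.enc
#   *.iso
#   download
```

With GNU tar, exclude patterns are not anchored by default, so a line such as download skips any file or directory with that name anywhere in the tree.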

You can have your backup script check to make sure that you have mounted a container file with the following lines of Bash code.

if [ ! -d /mnt/1/lost+found ]; then
    echo "Encrypted file system not mounted at /mnt/1"
    echo "Exiting this backup script"
    exit 1
fi

You can check for backup file name collisions with the following code. The variable BKTARFILE contains the full path name of the tar file that you would create for this backup.

if [ -f "$BKTARFILE" ]; then
    echo "Backup file for today already exists."
    echo "Exiting procedure."
    echo
    exit 1
fi


I guess that's about it. Some day when I get all of this polished up I'll probably post it in my TR blog.

Just curious to know......

by Choppit In reply to DOWNLOAD: Use built-in UN ...

Why do *nix backup script examples never include verification?

GNU tar has verify option

by stress junkie In reply to Just curious to know..... ...

I don't know why people don't use the verify option in example scripts. There are two ways to tell the GNU tar command to verify. These are -W or --verify.
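A minimal sketch of a verified backup with GNU tar follows. GNU tar cannot verify a compressed archive, so this version writes a plain .tar file; the function name and the argument order are assumptions.

```shell
#!/bin/bash
# Hypothetical wrapper around GNU tar's --verify (-W) option.
# usage: verified_backup ARCHIVE SRC
# ARCHIVE should be an absolute path outside SRC. tar writes the archive,
# then re-reads it and compares it with the files on disk; a non-zero
# exit status signals a problem.
verified_backup() {
    ( cd "$2" && tar -c -W -f "$1" . )
}
```

The archive can be compressed afterwards with gzip once the verification pass has succeeded.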

Good question.

remote backups?

by rodkey In reply to DOWNLOAD: Use built-in UN ...

This article is good as far as it goes (a single, independent server backing up locally to disk),
but it is not terribly helpful for a group of servers
backing up to a remote server, which gives me
a bit more sense of security.

For this task, tar and crontab can also be used,
perhaps doing something like
dt=`date +"%Y-%m-%d"` # note that ymd vs. dmy allows coherent sorting
tarfile=archive-$dt.tgz
tar -czf $tarfile /
(adding of course, all the excludes & options mentioned in the article)

followed by
scp $tarfile backupserver:/backups/`hostname`-$tarfile

Of course, you'll need to add into your
automated script a routine to delete all but
the last N weeks of backups, or you will exceed
the capacity of your backup server in short order.
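One way to prune old archives is to rely on the sortable date in the file name and delete from the front of the list until only N remain. This is a sketch: the function name, the .tgz suffix match, and the assumption that the directory holds the archives of a single host are all mine.

```shell
#!/bin/bash
# Hypothetical helper: keep only the newest KEEP archives in DIR.
# usage: prune_backups DIR KEEP
# Relies on YYYY-MM-DD in the file names, so that the shell's sorted glob
# expansion lists the oldest archive first.
prune_backups() {
    local dir="$1" keep="$2"
    set -- "$dir"/*.tgz
    [ -e "$1" ] || return 0        # nothing to prune
    while [ $# -gt "$keep" ]; do
        rm -f -- "$1"              # remove the oldest remaining archive
        shift
    done
}
```

On the backup server this could run from cron after each transfer, for example prune_backups /backups 4.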

But the real problem is with bandwidth.
You can solve this somewhat by doing
a full backup once every two weeks or so,
then doing
an incremental backup using the --newer=XXX option,
where XXX is yesterday's date (or two days
ago, if you want to be sure).
But even this has problems, especially on
filesystems with lots of changes (like a
file system that contains lots of mbox-style
mail spool files, for instance)
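The incremental step might be scripted roughly as follows. It assumes GNU date (for the -d option) and GNU tar (for --newer); the function name and the argument order are made up for this sketch.

```shell
#!/bin/bash
# Hypothetical wrapper: archive only files changed since DAYS days ago.
# usage: incremental_backup ARCHIVE SRC DAYS
incremental_backup() {
    local since
    since=$(date -d "$3 days ago" +%Y-%m-%d) || return 1   # GNU date syntax
    tar -czf "$1" --newer="$since" -C "$2" .
}
```

Passing 2 instead of 1 for DAYS gives the overlap mentioned above, at the cost of a somewhat larger archive.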

And of course, incrementals produce more
headaches for recovery. To ease this, I
redirect the output of the tar command to a
file named archive-$date.list . This file
can be easily searched for files of interest
by going to the backup directory and issuing a
command like

grep -l -i "myfilename" *.list | sed 's/\.list$/.tgz/'

This gives the filename of the tar file containing
an incremental of the file of interest.
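Producing the .list file is just a matter of running tar verbosely and capturing its standard output. Here is a sketch with an assumed function name and the archive-DATE naming used above. It relies on GNU tar, which prints the verbose listing on standard output when the archive goes to a file.

```shell
#!/bin/bash
# Hypothetical wrapper: write DIR/archive-DATE.tgz plus a matching
# DIR/archive-DATE.list that names every file stored in the archive.
# usage: backup_with_list DIR SRC DATE
backup_with_list() {
    tar -czvf "$1/archive-$3.tgz" -C "$2" . > "$1/archive-$3.list"
}
```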

Another approach that has some merit is using rsync to create a duplicate of your original
filesystem on the backup host. This is feasible
with the large disks available now for most medium sized filesystems.

rsync compares file size and modification time,
so only the minimum amount of data is transferred.
It also does pipelining, so it should be very
efficient in reading and writing the requisite data.

rsync -a source backupserver:/destination
is the general syntax for this, although you
can add complexity with exclude lists and the like
by reading the manual for rsync and picking the
features that suit you.
