
Create ZIP archives on the fly with Perl


This article is also available as a TechRepublic download, which contains the code listings in a copy-and-paste-friendly text format.

Perl comes with a wide variety of both built-in functions and external modules to manipulate different file formats. In particular, it has the ability to dynamically create and read compressed ZIP archives without relying on external tools and utilities, via its Archive::Zip module. This tutorial runs you through the basics, illustrating the most common uses of this module.

To begin with, download and install the module (if you don't already have it) by running the following command at your shell prompt:

perl -MCPAN -e "install Archive::Zip"

Note that Archive::Zip depends on the zlib compression library, so you may be asked to download and install it as well during the installation process.
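To confirm that the installation worked before moving on, a short check along these lines can help. This is only a sketch; the variable name $installed is illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Try to load the module; eval traps the error if it is not installed.
my $installed = eval { require Archive::Zip; 1 };

if ($installed) {
    print "Archive::Zip version ", Archive::Zip->VERSION, " is installed\n";
} else {
    print "Archive::Zip is not installed: $@";
}
```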

Creating ZIP archives

Let's begin with a simple example: dynamically creating a ZIP archive that contains a few other files. Type (or copy) the script shown in Listing A:

Listing A

#!/bin/perl

use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);   # imports, including the AZ_OK constant

my $obj = Archive::Zip->new();   # new instance

my @files = ('mystuff/ad.gif',
             'mystuff/alcon.doc',
             'mystuff/alcon.xls');   # files to store

foreach my $file (@files) {
    $obj->addFile($file);   # add files
}

if ($obj->writeToFileNamed('dummy.zip') != AZ_OK) {  # write to disk
    print "Error in archive creation!\n";
} else {
    print "Archive created successfully!\n";
}

This script is quite simple, but it's worth looking at it in detail. The first step is to import the Archive::Zip module, and initialize an empty instance of an Archive::Zip object. Next, a list of all the files to be packaged, together with their disk locations, is saved as a Perl array. It's important to remember that the script (more precisely, the user the script runs as) must have permission to access these disk locations or else the archive creation process will fail.

A foreach() loop is then used to iterate over this array, adding the files listed within it to the archive via the object's addFile() method. Once the loop has completed executing, the final archive is written to disk via a call to the writeToFileNamed() method, which accepts the full path and name of the ZIP file to be created. Remember that the script must have permission to write the file to the named disk location; if not, the writeToFileNamed() method will fail and the archive will not be created.
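By default, addFile() stores each file under the path you pass in. As a sketch of two related options, the example below passes addFile() a second argument to choose the name used inside the archive, and uses addString() to store in-memory data as a member. The file names and the scratch directory are hypothetical, created only so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Temp qw(tempdir);
use File::Spec;

# Work in a scratch directory so the sketch does not touch real files.
my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'notes.txt');   # hypothetical sample file
open(my $fh, '>', $source) or die "Cannot create $source: $!";
print $fh "sample contents\n";
close($fh);

my $zip = Archive::Zip->new();

# The second argument is the name the file will have inside the archive.
$zip->addFile($source, 'docs/notes.txt') or die "Cannot add $source";

# addString() stores in-memory data as a member; no file on disk is needed.
$zip->addString("Generated by a script\n", 'README.txt');

my $target = File::Spec->catfile($dir, 'renamed.zip');
my $status = $zip->writeToFileNamed($target);
die "Error in archive creation!" unless $status == AZ_OK;

# List the names as stored in the archive.
my @names = sort $zip->memberNames();
print "$_\n" for @names;
```

Storing files under archive-relative names like this keeps local directory layout out of the ZIP file.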

Note, in particular, the return value of the call to writeToFileNamed(). If the file was correctly written, Archive::Zip returns a value of AZ_OK, one of the constants the module exports under its :ERROR_CODES tag, and this should be checked before proceeding. You'll see this return value in use further along in this document as well.
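To see what a failed call looks like, here is a minimal sketch that deliberately reads an archive that does not exist and inspects the status. It uses Archive::Zip::setErrorHandler() to collect the module's messages rather than letting it print them; the file name no-such-file.zip is made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Temp qw(tempdir);
use File::Spec;

# Collect the module's error messages instead of letting it print them.
my @errors;
Archive::Zip::setErrorHandler(sub { push @errors, @_ });

# A path that is guaranteed not to exist (hypothetical name).
my $dir     = tempdir(CLEANUP => 1);
my $missing = File::Spec->catfile($dir, 'no-such-file.zip');

my $zip    = Archive::Zip->new();
my $status = $zip->read($missing);

if ($status == AZ_OK) {
    print "Archive read successfully\n";
} else {
    print "Read failed with status $status\n";
    print "  $_\n" for @errors;
}
```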

To use the example script above, modify the contents of the @files array to reflect your local system configuration and try executing it, either at the command prompt or through your browser. If all goes well, the script should find your files and package them into a single ZIP archive named dummy.zip.

Viewing ZIP archive contents

What about looking inside an existing archive? Archive::Zip comes with a read() method that reads the contents of an archive and provides access to detailed information about individual members. Listing B is an example of it in action:

Listing B

#!/bin/perl

use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);   # imports, including the AZ_OK constant

my $obj = Archive::Zip->new();   # new instance

my $status = $obj->read('dummy.zip');  # read file contents

if ($status != AZ_OK) {
    die('Error in file!');
} else {
    foreach my $member ($obj->members()) { # print file information
        print $member->fileName(), ", ", $member->uncompressedSize(), ":", $member->compressedSize(), "\n";
    }
}

Here, the read() method is used to read the ZIP archive and obtain information on its contents. A call to the members() method then returns a list of member objects, each representing an individual file from the archive. Typically, each object holds information on the name of the corresponding file, its permission mode, its status, its compression type, its size, and the time of last modification. It's fairly easy to extract this information with a loop and reformat it to make it more presentable, as Listing B does.
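Beyond names and sizes, each member object exposes the other details mentioned above. The sketch below builds a small throwaway archive (the member name greeting.txt and its contents are invented for the example) and then reports the compression method and last-modified time alongside the sizes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES :CONSTANTS);
use File::Temp qw(tempdir);
use File::Spec;

# Build a small sample archive in a scratch directory.
my $dir     = tempdir(CLEANUP => 1);
my $archive = File::Spec->catfile($dir, 'sample.zip');

my $out    = Archive::Zip->new();
my $sample = $out->addString('hello ' x 200, 'greeting.txt');
$sample->desiredCompressionMethod(COMPRESSION_DEFLATED);   # ask for compression
$out->writeToFileNamed($archive) == AZ_OK or die "Cannot create $archive";

# Read it back and describe each member.
my $zip = Archive::Zip->new();
$zip->read($archive) == AZ_OK or die "Error in file!";

my @report;
foreach my $member ($zip->members()) {
    my $method = $member->compressionMethod() == COMPRESSION_DEFLATED
        ? 'deflated'
        : 'stored';
    push @report, sprintf(
        "%s: %d bytes (%d compressed, %s), modified %s",
        $member->fileName(),
        $member->uncompressedSize(),
        $member->compressedSize(),
        $method,
        scalar(localtime($member->lastModTime())),
    );
}
print "$_\n" for @report;
```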

Here's a sample of the output:

mystuff/ad.gif, 1447:345
mystuff/alcon.doc, 200:34
mystuff/alcon.xls, 28580:21483
...

Inserting files into existing ZIP archives

If you already have a ZIP archive and simply need to add a new file to it, it's extremely simple: Just call the addFile() method with the name and path to the file to be added. To illustrate, let's go back to dummy.zip and try adding some new files to it, as shown in Listing C.

Listing C

#!/bin/perl

use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);   # imports, including the AZ_OK constant

my $obj = Archive::Zip->new();   # new instance

my $status = $obj->read('dummy.zip');  # read file contents

if ($status != AZ_OK) {
    die('Error in file!');
} else {
    my @files = ('otherstuff/logo.gif',    # files to add
                 'otherstuff/header.gif',
                 'morestuff/berlin-bear.psd');

    foreach my $file (@files) {
        $obj->addFile($file);   # add files
    }

    if ($obj->overwrite() != AZ_OK) {   # overwrite archive with new contents
        print "Error in archive creation!\n";
    } else {
        print "Archive updated successfully!\n";
    }
}

The procedure to insert a file into an existing archive is very similar to that of creating a new archive: Initialize a new Archive::Zip object and read the original archive into it, create an array holding the list of files to be added, and pass each entry to the addFile() method in a loop. Once the files are successfully added, the overwrite() method can be used to overwrite the original archive with the updated one.
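One thing to watch when updating an archive is adding a name that is already present. A possible guard, sketched below, is to look the name up with memberNamed() before adding it. To keep the example self-contained it uses addString() with invented names and contents in place of addFile(); the same check applies either way.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Temp qw(tempdir);
use File::Spec;

my $dir     = tempdir(CLEANUP => 1);
my $archive = File::Spec->catfile($dir, 'dummy.zip');

# Build a starting archive with a single member.
my $seed = Archive::Zip->new();
$seed->addString("first\n", 'first.txt');
$seed->writeToFileNamed($archive) == AZ_OK or die "Cannot create $archive";

# Read the archive back in, as in the listing above.
my $zip = Archive::Zip->new();
$zip->read($archive) == AZ_OK or die "Error in file!";

# Hypothetical additions: member name => contents.
my %additions = (
    'first.txt'  => "duplicate\n",
    'second.txt' => "second\n",
);

foreach my $name (sort keys %additions) {
    if ($zip->memberNamed($name)) {   # already in the archive
        print "Skipping $name (already in archive)\n";
        next;
    }
    $zip->addString($additions{$name}, $name);
}

$zip->overwrite() == AZ_OK or die "Error updating archive!";

# Re-read the file from disk to confirm what it now contains.
my $check = Archive::Zip->new();
$check->read($archive) == AZ_OK or die "Cannot re-read $archive";
my @names = sort $check->memberNames();
print "$_\n" for @names;
```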

Extracting files from existing ZIP archives

Once you've got the files into the archive, how do you get them out? With the extractTree() method, obviously! This method lets you extract all the files from an existing archive to a specified directory, as illustrated in Listing D:

Listing D

#!/bin/perl

use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);   # imports, including the AZ_OK constant

my $obj = Archive::Zip->new();   # new instance

my $status = $obj->read('dummy.zip');  # read file contents

if ($status != AZ_OK) {
    die('Error in file!');
} else {
    $obj->extractTree(undef, "/tmp/");    # extract files
}

Here, the extractTree() method unpacks the entire archive and extracts all the files within it to a specified directory. If this directory does not exist, it will be created automatically by extractTree(), assuming of course that the script has permission to write to disk.
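Like the write methods, extractTree() returns a status code that is worth checking. The following sketch extracts a throwaway archive into a directory that does not yet exist and verifies the result. The member names are invented, and the realpath() call is a precaution based on the assumption that some Archive::Zip versions refuse to extract through symlinked directories.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Temp qw(tempdir);
use Cwd qw(realpath);

# Resolve symlinks in the scratch path (assumption: some Archive::Zip
# versions may refuse to extract through symlinked directories).
my $dir     = realpath(tempdir(CLEANUP => 1));
my $archive = "$dir/dummy.zip";

# Build a sample archive containing a nested path.
my $seed = Archive::Zip->new();
$seed->addString("logo data\n", 'images/logo.txt');
$seed->addString("readme\n",    'README.txt');
$seed->writeToFileNamed($archive) == AZ_OK or die "Cannot create $archive";

my $zip = Archive::Zip->new();
$zip->read($archive) == AZ_OK or die "Error in file!";

# The destination does not exist yet. Keep the trailing slash, because the
# destination is prepended to each member name.
my $dest   = "$dir/unpacked/";
my $status = $zip->extractTree('', $dest);

if ($status == AZ_OK) {
    print "Extracted to $dest\n";
} else {
    print "Extraction failed with status $status\n";
}
```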

Interestingly, it's also possible to perform more selective extraction, where only files that match a predefined list are extracted. To do this, you need to use the extractMember() method, which extracts compressed files one by one, enabling you to place an intermediate filter or check before going ahead with the extraction. Examples of this may be seen in the module's documentation.
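As a rough illustration of selective extraction, the sketch below walks the members of a throwaway archive and extracts only those whose names appear in a predefined list, passing extractMember() an explicit destination path. The member names, the %wanted list, and the "selected" directory are all hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Temp qw(tempdir);
use File::Basename qw(basename);
use File::Spec;
use Cwd qw(realpath);

# Scratch directory with symlinks resolved, as a precaution.
my $dir     = realpath(tempdir(CLEANUP => 1));
my $archive = File::Spec->catfile($dir, 'dummy.zip');

# Build a sample archive with three members.
my $seed = Archive::Zip->new();
$seed->addString("logo\n",   'images/logo.gif');
$seed->addString("banner\n", 'images/banner.gif');
$seed->addString("readme\n", 'docs/readme.txt');
$seed->writeToFileNamed($archive) == AZ_OK or die "Cannot create $archive";

my $zip = Archive::Zip->new();
$zip->read($archive) == AZ_OK or die "Error in file!";

# The predefined list of members to extract.
my %wanted = map { $_ => 1 } ('images/logo.gif', 'docs/readme.txt');

my $dest = File::Spec->catdir($dir, 'selected');
mkdir $dest or die "Cannot create $dest: $!";

my @extracted;
foreach my $member ($zip->members()) {
    my $name = $member->fileName();
    next unless $wanted{$name};   # the intermediate filter

    # Extract under the destination directory, dropping the archive path.
    my $target = File::Spec->catfile($dest, basename($name));
    $zip->extractMember($member, $target) == AZ_OK
        or die "Cannot extract $name";
    push @extracted, $name;
}
print "Extracted: $_\n" for @extracted;
```

The filter here is a simple hash lookup, but any test on the member object (size, name pattern, modification time) could take its place.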

Flexibility

As the examples illustrate, Perl's Archive::Zip class is quite versatile and allows you a great deal of flexibility when manipulating ZIP archives. Hopefully, the sample scripts above whetted your appetite and you're now going to play with the class yourself and find out a little more about how it works. Happy coding!
