In March, a new Google product called Zopfli was released to the public with little notice or fanfare. Realistically, the general public does not find compression algorithms as interesting as a new smartphone. IT professionals, however, should take note of this exciting new advancement in data compression.
Zopfli is a compression algorithm that is compatible with the DEFLATE algorithm used in zlib, allowing it to be used seamlessly with already deployed programs and devices that support the standard. Zopfli produces files that are 4-8% smaller than zlib at the expense of being substantially slower to compress a file than other implementations of the DEFLATE algorithm. Zopfli is the brainchild of Dr. Jyrki Alakuijala and Lode Vandevenne, who wrote the program as part of their “20% Time” at Google. It is distributed under the Apache 2.0 license.
As Zopfli produces files compatible with the DEFLATE algorithm, client systems do not require any modification to use Zopfli-compressed files, making the deployment of this technology in your product or service relatively simple.
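The compatibility point can be illustrated with a minimal sketch. It assumes Python's standard gzip and zlib modules and does not invoke Zopfli itself: two zlib compression levels stand in for "different compressors", and the helper name gzip_with_level is hypothetical. The decoder is identical in both cases, which is the same property that lets existing clients read Zopfli output unchanged.

```python
import gzip
import zlib


def gzip_with_level(data: bytes, level: int) -> bytes:
    """Wrap data in a gzip container using zlib's DEFLATE at the given level.

    Hypothetical helper: it stands in for "some DEFLATE-compatible compressor".
    """
    return gzip.compress(data, compresslevel=level)


sample = b"Zopfli emits ordinary DEFLATE streams. " * 200

fast = gzip_with_level(sample, 1)   # quick, looser compression
dense = gzip_with_level(sample, 9)  # slower, tighter compression

# The same stock decoder handles both streams; it never needs to know
# which compressor, or which effort level, produced them.
assert gzip.decompress(fast) == sample
assert gzip.decompress(dense) == sample

# zlib can decode the gzip container too (wbits=31 selects gzip framing).
assert zlib.decompress(dense, wbits=31) == sample

print(len(sample), len(fast), len(dense))
```

Swapping in a Zopfli binding would only change the compression call; the decompression side stays as written.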
A performance comparison of Zopfli and other implementations of the DEFLATE compression algorithm is provided in the Zopfli documentation. In this example, 100 million bytes of text from the English Wikipedia (“enwik8”) were compressed with Zopfli and kzip using default arguments, with gzip at -9, and with 7-Zip in DEFLATE (not LZMA) mode at -mx9; the latter two are the highest available settings for those programs.
In this example, the Zopfli-compressed file is 1,449,492 bytes smaller than the gzip-compressed file, but it took 7.5 minutes longer to compress. Zopfli and gzip output the gzip file format, whereas 7-Zip and kzip output the zip file format. The header overhead difference between the gzip and zip file formats is below 0.0001% of the output size.
The time it takes to compress files makes Zopfli unsuitable for data that needs to be compressed on the fly. Instead, the strength of Zopfli is in compressing binary blobs that change infrequently, if ever, or that are downloaded often enough that the extra compression time is repaid in faster downloads.
The benefits for cloud applications should not be understated: if your bandwidth usage on Amazon S3 is 10 TB per month, compressing files 5% better, and thereby reducing the bandwidth served from S3, amounts to savings of about $1,200 per year. Larger projects, or bandwidth-heavy circumstances such as pushing a software update via Amazon S3, would benefit immensely from aggressively compressing the data transmitted through the cloud.
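The arithmetic behind that estimate can be sketched as follows. The per-gigabyte price is an assumption, not a quoted Amazon rate: a figure of about $1,200 per year on 10 TB per month corresponds to an effective $0.20 per GB if 1 TB is counted as 1,000 GB. The function name and constant are hypothetical, and the result should be recalculated with your own transfer pricing.

```python
def annual_savings_usd(monthly_gb: float, size_reduction: float, usd_per_gb: float) -> float:
    """Estimate yearly transfer savings from serving smaller files.

    monthly_gb     -- data served per month, in GB
    size_reduction -- fraction by which files shrink (0.05 means 5%)
    usd_per_gb     -- transfer price per GB (assumed, check your own bill)
    """
    saved_gb_per_month = monthly_gb * size_reduction
    return saved_gb_per_month * usd_per_gb * 12


# Assumption: effective price implied by the article's figure, not an official rate.
ASSUMED_USD_PER_GB = 0.20

# 10 TB per month, counted as 10,000 GB, with files 5% smaller.
estimate = annual_savings_usd(10_000, 0.05, ASSUMED_USD_PER_GB)
print(f"Estimated savings: ${estimate:,.2f} per year")
```

Because the savings scale linearly with traffic, doubling either the monthly volume or the size reduction doubles the estimate.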
Where this technology truly shines is in mobile, where tighter compression results in reduced battery use and less strain on the subscriber’s data plan. Returning to the previous example of software updates, pushing software or firmware updates over a 3G or 4G connection to millions of subscribers with a particular model of phone is a circumstance in which every byte costs money. Here, it is vital to squeeze every ounce of performance out of the network infrastructure, and the aggressive compression found in Google Zopfli reduces network load for such tasks.
On the app developer side, the APK file format, like the Java JAR file format on which it is based, stores program files inside what is otherwise a standard DEFLATE-compatible ZIP file. Using Zopfli as a stand-in compressor in your IDE of choice to create your distributable APK or JAR files, combined with properly configured use of the optimization and obfuscation tool ProGuard, creates a very compact distributable binary file for your users.
For web designers, Zopfli can also be used to optimize the IDAT chunks in a PNG image, as this portion of the file is compressed with DEFLATE. Various attempts by curious programmers at writing a Zopfli-based processor for PNG images can be found on GitHub.
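To see that a JAR or APK really is an ordinary ZIP archive with DEFLATE-compressed entries, the following minimal sketch uses Python's standard zipfile module. The function name deflate_report and the entry names are hypothetical, and the archive is built in memory as a stand-in; a real APK or JAR would be read from disk instead.

```python
import io
import zipfile


def deflate_report(archive_bytes: bytes) -> list[tuple[str, int, int, bool]]:
    """Return (name, compressed size, original size, uses DEFLATE) per entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [
            (
                info.filename,
                info.compress_size,
                info.file_size,
                info.compress_type == zipfile.ZIP_DEFLATED,
            )
            for info in archive.infolist()
        ]


# Build a small stand-in archive in memory with JAR-like entry names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    archive.writestr("com/example/Hello.class", b"\xca\xfe\xba\xbe" + b"\x00" * 512)

report = deflate_report(buffer.getvalue())
for name, packed, unpacked, is_deflate in report:
    print(f"{name}: {unpacked} -> {packed} bytes, DEFLATE={is_deflate}")
```

Entries reported as DEFLATE are the ones a Zopfli-based packager could recompress without changing how the archive is read.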
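The PNG case can be sketched as follows, with the loud caveat that this example uses zlib at level 9 as a stand-in for Zopfli, since Zopfli is not part of the Python standard library. It is not how any particular GitHub project works. All function names are hypothetical; a real tool would swap the zlib.compress call for a Zopfli binding and would usually also try different PNG filters.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(kind: bytes, payload: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset < len(png):
        (length,) = struct.unpack_from(">I", png, offset)
        kind = png[offset + 4 : offset + 8]
        payload = png[offset + 8 : offset + 8 + length]
        yield kind, payload
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC


def idat_stream(png: bytes) -> bytes:
    """Concatenate all IDAT payloads; together they form one zlib stream."""
    return b"".join(data for kind, data in iter_chunks(png) if kind == b"IDAT")


def recompress_idat(png: bytes, level: int = 9) -> bytes:
    """Rewrite the PNG with its image data recompressed into a single IDAT chunk."""
    raw = zlib.decompress(idat_stream(png))
    # Stand-in: a Zopfli binding would replace zlib.compress here.
    new_idat = make_chunk(b"IDAT", zlib.compress(raw, level))
    out = [PNG_SIGNATURE]
    emitted = False
    for kind, payload in iter_chunks(png):
        if kind == b"IDAT":
            if not emitted:
                out.append(new_idat)
                emitted = True
        else:
            out.append(make_chunk(kind, payload))
    return b"".join(out)


# Tiny 8x8 grayscale test image: each scanline is a filter byte (0) plus 8 pixels.
raw_pixels = b"".join(
    bytes([0]) + bytes((x * y) % 256 for x in range(8)) for y in range(8)
)
original = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 8, 8, 8, 0, 0, 0, 0))
    + make_chunk(b"IDAT", zlib.compress(raw_pixels, 1))
    + make_chunk(b"IEND", b"")
)

optimized = recompress_idat(original)
print(len(original), len(optimized))
```

The pixel data is untouched: only the compressed representation inside IDAT changes, so any PNG decoder reads the result exactly as before.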
Overall, the benefits inherent in the Zopfli compression algorithm are worth the time required to implement the library and compress the files in your project. With this knowledge in hand, you can squeeze data together more tightly, and squeeze every penny of your cloud budget until Lincoln screams.
James Sanders is a Java programmer specializing in software as a service and thin client design, and virtualizing legacy programs for modern hardware. James is currently an education major at Wichita State University in Kansas.