Compare file contents and render the output with PHP and PEAR

The Text_Diff PEAR class makes it possible to compare file contents in the PHP environment and render the output in various formats. This tutorial will demonstrate this class in action, illustrating how you can use it to dynamically compare file contents with PHP and render the results as a Web page.

This article is also available as a TechRepublic download.

When it becomes necessary to compare two or more text files in UNIX, most developers reach for the diff program. This program, included by default in almost all UNIX distributions, compares the files line by line and displays the changes between them in a number of different output formats.

Though diff originally was (and still is) a command-line utility, packages replicating its functionality are available for most development environments and languages, including Perl, JSP, and PHP. And so we come to Text_Diff, a PEAR class that makes it possible to compare file contents in the PHP environment and render the output in various formats.

This tutorial will demonstrate this class in action, illustrating how you can use it to dynamically compare file contents with PHP and render the results as a Web page. I'll assume here that you have a working Apache and PHP installation and that the PEAR Text_Diff class has been correctly installed.

Note: You can install the PEAR Text_Diff package directly from the Web, either by downloading it from the PEAR site or by running the PEAR installer (pear install Text_Diff).

Setting up test files

Before writing any code, it's necessary to set up the test files we'll be using in this tutorial. These are two simple files, with some deliberate differences that Text_Diff should be able to pick up on. Listing A is the first file, named data1.txt.

Listing A

apple
banana
cantaloupe
drumstick
enchilada
fig
grape
horseradish

And Listing B is the second file, named data2.txt.

Listing B

apple
bat
cantaloupe
drumstick
enchilada
fig
peach
pear



zebra

Performing basic comparison

Having set up the files, let's begin with a simple illustration of how Text_Diff works. Start with the script in Listing C.

Listing C

<?php
// adjust file paths as per your local configuration!

include_once "Text/Diff.php";
include_once "Text/Diff/Renderer.php";

// define files to compare
$file1 = "data1.txt";
$file2 = "data2.txt";

// perform diff, print output
$diff = new Text_Diff(file($file1), file($file2));
$renderer = new Text_Diff_Renderer();
echo $renderer->render($diff);
?>

This is fairly simple at first glance. There are two basic classes in the Text_Diff package: Text_Diff(), which performs the comparison and returns the diff output; and Text_Diff_Renderer(), which formats that output so it is easy to read. The Text_Diff() object, in particular, must be initialized with the actual contents (and not the locations) of the two files to be compared.
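Because the comparison works on file contents rather than file names, it can be worth checking that each file actually loads before handing it over. The following is a minimal sketch of that idea, not part of the Text_Diff package: load_lines() is a hypothetical helper name, and the choice to strip line endings with FILE_IGNORE_NEW_LINES is an assumption made here for tidier arrays.

```php
<?php
// Hypothetical helper (not part of Text_Diff): read a file into an array of
// lines, failing loudly if the file cannot be read.
function load_lines(string $path): array
{
    if (!is_readable($path)) {
        throw new RuntimeException("Cannot read file: $path");
    }
    // FILE_IGNORE_NEW_LINES strips the trailing newline from each element
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Failed to load file: $path");
    }
    return $lines;
}

// Demonstration with a temporary file standing in for data1.txt
$tmp = tempnam(sys_get_temp_dir(), 'diff');
file_put_contents($tmp, "apple\nbanana\ncantaloupe\n");
$lines = load_lines($tmp);
unlink($tmp);
print_r($lines);
```

The arrays returned by a helper like this could then be passed to the Text_Diff() constructor in place of the bare file() calls.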

Listing C begins by initializing these two objects, using PHP's file() function to read each file into an array of lines. The Text_Diff_Renderer() object is then used to render the output in standard diff format, producing output which should be familiar to any UNIX developer:

2c2
<banana
---
>bat
7,8c7,12
<grape
<horseradish
---
>peach
>pear
>
>
>
>zebra
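For readers less familiar with this format, the header lines (2c2 and 7,8c7,12 above) carry most of the meaning: a line range in the first file, a letter for the action (a for add, c for change, d for delete), and a line range in the second file. Here is a small sketch that decodes such a header. The function name parse_change_command() and the array keys it returns are my own, chosen for illustration.

```php
<?php
// Hypothetical helper: decode a normal-format diff header such as "7,8c7,12".
// Returns null for lines that are not headers (for example "---").
function parse_change_command(string $line): ?array
{
    $pattern = '/^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }
    $actions = ['a' => 'add', 'c' => 'change', 'd' => 'delete'];
    // A range written as a single number starts and ends on the same line
    $oldEnd = ($m[2] ?? '') !== '' ? $m[2] : $m[1];
    $newEnd = ($m[5] ?? '') !== '' ? $m[5] : $m[4];
    return [
        'action'    => $actions[$m[3]],
        'old_start' => (int) $m[1],
        'old_end'   => (int) $oldEnd,
        'new_start' => (int) $m[4],
        'new_end'   => (int) $newEnd,
    ];
}

print_r(parse_change_command('7,8c7,12'));
```

Read this way, 7,8c7,12 says that lines 7 to 8 of the first file were changed into lines 7 to 12 of the second.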

Making differences easier to read

Now, the output above is not particularly easy to read unless you have lots of experience in decoding diff results. That's why Text_Diff comes with a couple of options to reformat this output into something more readable. These options are available as subclasses of the Text_Diff_Renderer() class and make it possible to view comparison results in either unified or inline format.

The following script (Listing D) modifies the previous example to demonstrate unified format:

Listing D

<html>
<head></head>
<body>

<pre>
<?php
// adjust file paths as per your local configuration!

include_once "Text/Diff.php";
include_once "Text/Diff/Renderer.php";
include_once "Text/Diff/Renderer/unified.php";

// define files to compare
$file1 = "data1.txt";
$file2 = "data2.txt";

// perform diff, print output
$diff = new Text_Diff(file($file1), file($file2));
$renderer = new Text_Diff_Renderer_unified();
echo $renderer->render($diff);
?>
</pre>

</body>
</html>

Notice the call to the appropriate child class when initializing the renderer.

And here's the output:

@@ -1,8 +1,12 @@
apple
-banana
+bat
cantaloupe
drumstick
enchilada
fig
-grape
-horseradish
+peach
+pear
+
+
+
+zebra

A quick explanation is in order here: in the unified format, the plus (+) prefix indicates added lines, the minus (-) prefix indicates deleted lines, and lines with neither prefix are unchanged. The @@ line at the top gives the line ranges covered: here, 8 lines starting at line 1 of the first file and 12 lines starting at line 1 of the second. Comparing the output above with the original files, it's fairly easy to see how the diff output reflects which lines have changed and what the changes are.
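Since the prefixes are so regular, they also make it easy to style the result when it is shown on a Web page. The sketch below is one possible approach and not something Text_Diff provides: unified_to_html() is a hypothetical helper, and the CSS class names (hunk, added, deleted, context) are assumptions you would pair with your own stylesheet. It also escapes each line, which matters if the compared files might contain HTML characters.

```php
<?php
// Hypothetical helper: wrap each line of unified-format diff text in a <span>
// whose CSS class reflects its prefix, escaping the content for safe display.
function unified_to_html(string $diffText): string
{
    $html = [];
    foreach (explode("\n", rtrim($diffText, "\n")) as $line) {
        $prefix = $line === '' ? '' : $line[0];
        if ($prefix === '@') {
            $class = 'hunk';      // the @@ ... @@ range header
        } elseif ($prefix === '+') {
            $class = 'added';
        } elseif ($prefix === '-') {
            $class = 'deleted';
        } else {
            $class = 'context';   // unchanged line
        }
        $html[] = '<span class="' . $class . '">' . htmlspecialchars($line) . '</span>';
    }
    return implode("\n", $html);
}

$sample = "@@ -1,2 +1,2 @@\n apple\n-banana\n+bat\n";
echo unified_to_html($sample), "\n";
```

In a script like Listing D, the string returned by the renderer could be passed through a function of this kind before being echoed inside the pre element.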

Of course, it's possible to make it even more user-friendly -- and that's precisely what inline formatting tries to accomplish. In this format, strikethroughs are used to visually indicate which characters and lines have changed. Listing E shows you how to use it.

Listing E

<html>
<head></head>
<body>

<pre>
<?php
// adjust file paths as per your local configuration!

include_once "Text/Diff.php";
include_once "Text/Diff/Renderer.php";
include_once "Text/Diff/Renderer/inline.php";

// define files to compare
$file1 = "data1.txt";
$file2 = "data2.txt";

// perform diff, print output
$diff = new Text_Diff(file($file1), file($file2));
$renderer = new Text_Diff_Renderer_inline();
echo $renderer->render($diff);
?>
</pre>

</body>
</html>

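And here's the output. Rendered in a browser, deleted text is struck through; in the plain text below that formatting is not visible, so deleted and inserted text run together (for example, bananabat):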

apple
bananabat
cantaloupe
drumstick
enchilada
fig
grape
horseradishpeach

pear



zebra
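To make the idea behind this format concrete, here is a tiny illustration of how a changed value can be marked up so that a browser shows the old text struck through next to the new text. This is only a sketch of the concept, using the standard HTML del and ins elements; mark_change() is a hypothetical function of my own and is not how Text_Diff is implemented internally.

```php
<?php
// Hypothetical helper, for illustration only: mark up a replaced value with
// the old text in <del> and the new text in <ins>, escaping both.
function mark_change(string $old, string $new): string
{
    $markup = '';
    if ($old !== '') {
        $markup .= '<del>' . htmlspecialchars($old) . '</del>';
    }
    if ($new !== '') {
        $markup .= '<ins>' . htmlspecialchars($new) . '</ins>';
    }
    return $markup;
}

echo mark_change('banana', 'bat'), "\n";
// prints: <del>banana</del><ins>bat</ins>
```

With the markup stripped away, that line reads bananabat, which is why the changed lines look the way they do when the formatting is not shown.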

And that's about it for this tutorial. Hopefully you now have a clear idea of how Text_Diff can be used to rapidly and efficiently compare files in the PHP environment and how the output can be formatted for easy readability. Happy coding!
