Clean potentially harmful metadata from Office documents with ezClean

Remove potentially harmful metadata from Microsoft Office documents using KKL Software's ezClean

If you regularly send Microsoft Word, Excel, or PowerPoint documents to your colleagues or clients via e-mail attachments, you're probably sending the documents to expedite your business operations. However, you may also be setting yourself up for sabotage.

While you may not realize it, Word, Excel, and PowerPoint documents are filled with metadata, which is basically hidden information that is stored in the document. Some of this metadata, such as author name or company name, can be considered innocuous. However, other metadata, which can include previous versions of the document, tracked changes, author comments, and even deleted text, could be harmful to your business if it were to fall into the wrong hands.

Fortunately, you don't have to live in fear of inadvertently exposing your business to sabotage. Instead, you can use ezClean from KKL Software to remove metadata from Word, Excel, or PowerPoint documents. This easy-to-use program can be configured to be used manually or to run automatically from Outlook.

How is metadata harmful?
The metadata considered most potentially harmful actually has enormous benefits during document creation and editing. It’s also important to realize that this metadata isn’t added to Office documents by default. In fact, users must go out of their way to enable the features that, by their very nature, add it to a document.

The real danger comes from not understanding exactly what happens behind the scenes when you enable features that add metadata to a document. The problem is compounded when users don’t realize that they should remove that metadata from any document sent electronically outside the enterprise.

Let’s take a closer look at some of the features that can add potentially harmful metadata to a document. Since Word is one of the biggest culprits when it comes to potentially harmful metadata, I'll focus on it for my example.

A partial list
As you look through this list of features, keep in mind that I’ve highlighted only the metadata that could cause the most harm if left in a document that is electronically shared outside the enterprise. There are lots of other features that add metadata to Microsoft Office documents. For example, metadata can exist in templates, styles, links, macros, and routing slips.

If you want a more comprehensive list of the Office features that add metadata to documents, investigate the Microsoft Knowledge Base article 223396—How to Minimize Metadata in Microsoft Office Documents. This article provides more information on metadata, as well as links to other articles that detail all the settings you may have to manually alter to minimize or remove metadata from documents created with Word, Excel, and PowerPoint versions 97, 2000, and 2002.

The Track Changes feature
As you’re editing a document, especially if you’re collaborating with another author or editor, you may need to use Word's Track Changes feature to keep track of any changes that you or your collaborators make to the document. When you use Track Changes, Word keeps track of any text that is deleted, added, or otherwise modified and stores that information in the document. Under normal circumstances, those changes are highlighted in different colors so they can later be accepted or rejected. Once changes are dealt with in this fashion, the alternate data is removed from the document.

However, you can disable the Track Changes feature without having to first accept or reject the changes. When you do so, the changes are still stored in the document, but not shown on the screen or in print. The goal of this option is to allow you to be able to analyze the text of the document without all the visual interference. But that's also assuming that you’ll reenable Track Changes and go back and accept or reject the changes later. If you forget to perform this last step, the changes remain in the document and anyone can later reenable the Track Changes feature and see those changes.
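The difference between hiding tracked changes and actually removing them can be illustrated with a toy model. The Python sketch below is not Word's file format or object model; the ToyDocument and Revision classes and their method names are hypothetical and exist only to show that turning off the display leaves the stored revisions in place, while accepting the changes discards them.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    kind: str  # "insert" or "delete"
    text: str


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document with tracked changes."""
    body: str
    revisions: list = field(default_factory=list)
    show_markup: bool = True

    def hide_markup(self):
        # Hiding the markup only changes the view; nothing is deleted.
        self.show_markup = False

    def accept_all(self):
        # Accepting (or rejecting) the changes is what discards stored revisions.
        self.revisions.clear()

    def stored_revision_text(self):
        return [revision.text for revision in self.revisions]


doc = ToyDocument(body="Our price is $90 per unit.")
doc.revisions.append(Revision("delete", "Our floor price is $60 per unit."))

doc.hide_markup()
leaked = doc.stored_revision_text()      # the deleted sentence is still stored

doc.accept_all()
remaining = doc.stored_revision_text()   # nothing is left to recover
```

In this model, a recipient who turns the markup display back on after hide_markup() would still see the deleted sentence; only accept_all() removes it.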

Comments
When using Word’s Track Changes feature, it’s common practice to use the Comments feature to provide detailed explanations for the changes made or to suggest additional collaboration. Typically, in a collaborative situation, one author will respond to another’s comments with comments of his or her own. Areas in a document containing comments are highlighted, but the actual text of a comment doesn’t appear onscreen unless you specifically select it, at which point it appears in a balloon.

Since the comments are essentially hidden, it’s up to you to specifically remember to delete comments from the document before distributing it. If you forget to remove the comments, anyone viewing the document later can read those comments.

Versions
When you’re creating or editing a document, you may take advantage of Word’s Versions feature to keep track of changes that you make to a document. When you use this feature, Word saves multiple versions of a document within the same document. Of course, Word saves only the differences between versions, not an entire copy of each version. You can then easily undo a series of changes to a document simply by returning to a previous version.

Once you’re finished with a document, it’s up to you to remember to go back to the Versions feature and delete any previous versions of the file. If you fail to do so, anyone viewing the document later can take a look at your previous versions and see the changes you’ve made.

Fast Save
The Fast Save feature is designed to speed up the save operation by recording only incremental changes in a document. In the process, the changes are actually appended to the document rather than overwriting the existing text. Word then keeps track of all these various appendages and pieces them together each time you open the document.

Once you’re finished with a document, remember to disable the Fast Save feature to allow Word to perform a full save. If you don't, anyone viewing the document later can see the changes you’ve made.
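To see why appended changes are a risk, consider a deliberately simplified Python model of an incremental save. This is only a sketch under stated assumptions: the "#CHANGE" record format and the fast_save and full_save functions are invented for illustration and do not reflect how Word actually stores data. The point is that an append-only save leaves the old text in the file, while a full rewrite does not.

```python
def fast_save(file_bytes: bytes, change: str) -> bytes:
    """Append a change record instead of rewriting the file (toy model)."""
    return file_bytes + b"\n#CHANGE " + change.encode("utf-8")


def full_save(current_text: str) -> bytes:
    """Rewrite the file from scratch so only the current text is stored."""
    return current_text.encode("utf-8")


original = "Offer: $60 per unit. Confidential."
on_disk = full_save(original)

# The author replaces the confidential sentence, then saves incrementally.
current = "Offer: to be discussed."
on_disk = fast_save(on_disk, "replace-all:" + current)
deleted_text_still_present = b"$60 per unit" in on_disk

# A full save writes only the current text, so the old sentence is gone.
on_disk = full_save(current)
deleted_text_after_full_save = b"$60 per unit" in on_disk
```

In this model the deleted price survives the incremental save and disappears only after the full save, which is the behavior the Fast Save warning describes.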

Downloading and installing ezClean
You can download a 45-day evaluation copy of ezClean from the Downloads page of KKL Software’s Web site. Once you download ezClean, you’ll find that the installation is contained in a Windows Installer Package. All you have to do is double-click on the MSI file to start the installation wizard.

After you install ezClean, you’ll find that the evaluation copy is fully functional and that you can use it to clean as many documents as you want. Keep in mind that once each day, ezClean will present a dialog box that displays the expiration date and prompts you to either license the product or continue the trial. After 45 days, the only option is to license the product, and unless you do so, you can’t run ezClean.

For current pricing information, you can contact the KKL Software Sales department Monday through Friday, 9 A.M. to 5 P.M. Eastern, by calling (212) 692-5660 or by e-mail.

ezClean works with Microsoft Office 2000 and Microsoft Office XP. It can be installed on all versions of the Windows operating system.

Additional documentation
On the ezClean download page, you’ll find two links that will allow you to download additional documentation as PDF files. One of these is the User’s Guide, and the other is the Installation Guide And Admin Manual. While ezClean’s built-in Help system does an excellent job of documenting the program, I recommend that you download both of these PDF files for supplementary reference material.

Using ezClean
When you install ezClean, you won’t find a shortcut on the Start menu. Instead, you’ll find an ezClean button on the Standard toolbar of Word, Excel, and PowerPoint. When you want to manually clean out any metadata in the document you’re currently working on, simply click the ezClean button on the toolbar. ezClean will then analyze the document for metadata and display the ezClean dialog box. For example, I decided to check a Word document—VBScript Article—that I was working on, as shown in Figure A.

Figure A
The ezClean dialog box provides you with detailed information on the metadata in a document.

The ezClean dialog box provides you with an abundance of information on the metadata contained in a document as well as several options for removing that metadata and saving the file. At the top-right of the ezClean dialog box, you’ll find the file save options, which go into effect as soon as the document is clean. By default, the Don’t Save option is selected, which means you’ll need to manually save the document after the clean operation. Of course, the Save option configures ezClean to save the document with the original name as soon as the clean operation is complete. If you want to keep the original document, you can choose the Save As option and then specify a new filename for the cleaned document.

The list box on the left shows the types of metadata found in the document. This particular document contains a good deal of potentially harmful metadata. The Track Changes feature is enabled, and there are 15 revisions that still need to be either accepted or rejected. You’ll also see that there are three versions stored in the document and five comments. In addition, the document contains an undo history, which can be used to reverse the most recent changes.

The list box on the right provides you with detailed information about each type of metadata you select. If you select an item from this second list, detailed information about that item appears in the panel below it. Keep in mind that anytime you need additional information or a description of an item, you can press [F1] to activate context-sensitive help. You can also click the Help icon and then select the item you need assistance with.

Once you’ve perused the information displayed in the ezClean dialog box, you can clean the document. To do so, just select the check boxes next to the metadata types that you want to remove and click the Clean button. As soon as you do, ezClean instantly removes the selected metadata from the document. You can confirm the operation simply by clicking the ezClean button to bring up the dialog box again.

Running ezClean from Outlook
If you chose to install the Outlook integration add-in, ezClean will be configured to automatically check each e-mail message you send for attached Word, Excel, or PowerPoint documents that contain metadata. Let’s take a look at how you go about configuring and using the Outlook ezClean add-in.

You can easily configure how you want the ezClean add-in to work from within Outlook’s Options dialog box, which you access by selecting the Options command from the Tools menu. You’ll select the KKL Software tab and then the ezClean tab, as shown in Figure B.

Figure B
There are several options for you to choose from when configuring ezClean to work from within Outlook.

As you'd expect, the default selection here is Check Outgoing Attachments For Metadata. With this option selected, ezClean will run automatically when a document containing metadata is detected as an e-mail attachment. If you select the Clean Detected Metadata Automatically check box, ezClean will automatically clean the metadata from the document using default configuration settings stored in the ezClean.ini file. You can find detailed information on editing the settings found in the ezClean.ini file in the supplemental Installation Guide And Admin Manual.

If you use Outlook to send both internal and external e-mail, you'll want to select the Check External Mail Only check box. When you do, ezClean will check the destination e-mail addresses of any message containing Office files. If the e-mail addresses are found to be Microsoft Exchange accounts, the message is considered internal and ezClean will ignore it. If the e-mail addresses are found to contain the @ symbol, the message is considered external and ezClean will go to work.
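The internal-versus-external rule can be sketched in a few lines of Python. This is an illustration of the rule as described here, not ezClean's actual code: the function names are hypothetical, and the handling of messages with a mix of internal and external recipients (treated as external) is an assumption, since that case isn't covered above.

```python
def is_external_address(address: str) -> bool:
    # Rule described in the text: an address containing "@" is external.
    return "@" in address


def should_check_message(recipients, has_office_attachment, external_only=True):
    """Return True if the attachment check should run for this message."""
    if not has_office_attachment:
        return False
    if not external_only:
        return True
    # Assumption: one external recipient is enough to treat the message as external.
    return any(is_external_address(recipient) for recipient in recipients)


# Illustrative addresses: an Exchange-style entry and an Internet-style address.
internal = ["/o=Example/ou=Site/cn=Recipients/cn=jsmith"]
external = ["client@example.com"]

check_internal = should_check_message(internal, has_office_attachment=True)
check_external = should_check_message(external, has_office_attachment=True)
check_mixed = should_check_message(internal + external, has_office_attachment=True)
```

With the Check External Mail Only behavior modeled this way, only the messages that include the Internet-style address are checked.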

When ezClean detects a document containing metadata as an e-mail attachment, it will display a prompt, like the one shown in Figure C. In this instance, ezClean detected the Another VBScript Article.doc file as an e-mail attachment.

Figure C
When ezClean detects an Office document in an e-mail message, it will display this dialog box and allow you to decide how to proceed.

The default option here is to display the ezClean dialog box and allow you to manually clean the metadata from the document as I described. If you choose the Auto Clean option, ezClean will automatically clean the metadata from the document using default configuration settings stored in the ezClean.ini file. Of course, if you select the Skip option, ezClean will ignore the document. Once you’ve selected an option, just click OK to proceed.

Compressed documents
Keep in mind that ezClean is unable to detect Office documents if they are packaged inside a compressed .zip file.
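Because ezClean can't detect Office documents packaged inside a compressed .zip file, you may want to check archives yourself before attaching them. The following Python sketch is one possible approach and is not part of ezClean: it uses the standard zipfile module to list archive members whose extensions suggest Office documents. The function name office_files_in_zip and the extension list are assumptions made for this example.

```python
import io
import zipfile

# Assumed list of Office 2000/XP-era extensions; extend it as needed.
OFFICE_EXTENSIONS = (".doc", ".dot", ".xls", ".xlt", ".ppt", ".pot")


def office_files_in_zip(zip_source):
    """Return names of archive members that look like Office documents."""
    with zipfile.ZipFile(zip_source) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith(OFFICE_EXTENSIONS)
        ]


# Build a small in-memory archive so the example runs without files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Another VBScript Article.doc", b"placeholder")
    archive.writestr("notes.txt", b"placeholder")
buffer.seek(0)

found = office_files_in_zip(buffer)
```

Any document the function reports would need to be extracted and cleaned before the archive is rebuilt and attached.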

Effective and easy
If you’re using features in Microsoft Word, Excel, or PowerPoint that add potentially harmful metadata to documents you send outside the enterprise, you should be concerned. Using ezClean as a solution to your metadata problems is both effective and easy.


Greg Shultz is a freelance Technical Writer. Previously, he worked as a Documentation Specialist in the software industry, a Technical Support Specialist in the educational industry, and a Technical Journalist in the computer publishing industry.
