Five applications for parsing big data

Brien Posey presents five apps he believes can help you mine valuable information from your organization's data.

Editor's note: Big data is a hot topic in information technology circles these days, but it continues to be an elusive concept for many. In the first entry in the TechRepublic Big Data Blog, John Weathington defines big data with these base principles:

"Big data is the massive amount of rapidly moving and freely available data that potentially serves a valuable and unique need in the marketplace, but is extremely expensive and difficult to mine by traditional means."

Brien Posey addresses the "difficult to mine" part of that definition with his list of five apps that you can use to extract valuable information from the big data your organization produces.

Five Apps

1. Email2DB

Email2DB is a data mining utility designed to extract information from E-mail messages and then use the extracted data to update a database.

This application has a number of different uses. One potential use is to compile a database of potential customers by analyzing E-mail messages that have been sent to your organization. The really nice thing about this application is that it allows you to specify triggers and actions. This makes it possible for the software not only to extract data from E-mail messages, but also to react to the contents of each message. These abilities could be used to manage mailing list memberships or even to respond to E-mail-based orders.
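To make the extract-then-update idea concrete, here is a minimal Python sketch using only the standard library. It does not use Email2DB and says nothing about how that product works internally. The sample message, the "Order:" line format, and the orders table are all assumptions invented for illustration.

```python
import re
import sqlite3
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; the "Order:" line format is an assumption.
RAW_MESSAGE = """\
From: Jane Doe <jane@example.com>
To: sales@example.com
Subject: Order request

Hello,
Order: 3 x WIDGET-42
Thanks
"""

# Matches lines such as "Order: 3 x WIDGET-42" anywhere in the body.
ORDER_PATTERN = re.compile(r"^Order:\s*(\d+)\s*x\s*(\S+)", re.MULTILINE)


def extract_order(raw):
    """Return (name, address, quantity, item), or None if no order line exists."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg["From"])
    match = ORDER_PATTERN.search(msg.get_payload())
    if match is None:
        return None
    return name, address, int(match.group(1)), match.group(2)


def process(conn, raw):
    """Trigger and action: store the order only when the message contains one."""
    order = extract_order(raw)
    if order is None:
        return False
    conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", order)
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (name TEXT, address TEXT, quantity INTEGER, item TEXT)"
)
stored = process(conn, RAW_MESSAGE)
rows = conn.execute("SELECT * FROM orders").fetchall()
print(stored, rows)
```

The check inside process plays the role of a trigger, and the database insert plays the role of an action. A real deployment would read messages from a mailbox rather than from a string.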

Email2DB sells for $595, but a free 30-day trial is available for download.

2. Log Parser

Log Parser is a free command line utility for Windows that allows you to perform queries against a variety of file types, including log files, CSV files, and XML files. This utility can even parse data sources such as Active Directory or the Windows event logs.

Log Parser is extremely flexible, but it is not a utility for novices. Using Log Parser requires experience with custom queries as well as with working from the command line.

3. Log Parser QL

Log Parser QL is a free utility for parsing CSV files or other types of delimited files. When a file is opened in Log Parser QL, the software displays a description of the file's fields. Once the fields are known, you can use SELECT statements to extract specific data from the file. This data can be viewed on screen or saved to either a separate CSV file or an HTML file. It is worth noting that you must install Java in order to run this utility.
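The workflow described above (inspect the fields, run a SELECT, save the result as HTML) can be sketched in Python with the standard csv and sqlite3 modules. This is only an illustration of the general technique, not Log Parser QL's own query syntax or implementation. The sample data, the table name "data", and the helper names are assumptions.

```python
import csv
import io
import sqlite3
from html import escape

# Hypothetical delimited data standing in for a file on disk.
CSV_TEXT = """\
host,status,bytes
web01,200,5120
web02,404,312
web01,500,87
"""


def load_csv(conn, table, text):
    """Load delimited text into a table and return the file's field names."""
    reader = csv.reader(io.StringIO(text))
    fields = next(reader)  # first row holds the field names
    columns = ", ".join(f'"{name}"' for name in fields)
    placeholders = ", ".join("?" for _ in fields)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return fields


def to_html(headers, rows):
    """Render query results as a simple HTML table."""
    head = "".join(f"<th>{escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


conn = sqlite3.connect(":memory:")
fields = load_csv(conn, "data", CSV_TEXT)
print("Fields:", fields)

# Values are loaded as text, so cast before comparing numerically.
cursor = conn.execute(
    "SELECT host, status FROM data "
    "WHERE CAST(status AS INTEGER) >= 400 ORDER BY host"
)
headers = [col[0] for col in cursor.description]
rows = cursor.fetchall()
html = to_html(headers, rows)
print(html)
```

Because every value arrives as text, the query casts status to an integer before the comparison. Without the cast, SQLite would compare a text value with a number and the filter would not behave as intended.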

4. Data Parse Free Edition

Data Parse Free Edition is a free Java-based utility for analyzing data. Although this utility is very flexible, it is not for the faint of heart.

The utility uses a scripting language to parse data. The good news, however, is that the scripting language is relatively intuitive, and a tree view presented within the Solution Explorer section of the interface helps you adapt the script to the data that you want to analyze.

5. ParseRat File Parser

ParseRat File Parser is a utility for converting and restructuring data. The utility is able to read data from a number of different formats, such as binary files, dBase, CSV, tab delimited, and more. Once ParseRat has been connected to the data source, it can be used to extract and reformat data before writing it to a new data file.

ParseRat is able to do more than just filter records. It can actually restructure data. For example, it can change a name from last name, first name format to first name last name format. The software can also add titles such as Mr. or Ms.
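The name restructuring described above is simple to picture in code. The following Python sketch shows one way to do it. It is not ParseRat's implementation, and the function name and sample names are hypothetical.

```python
def restructure_name(name, title=None):
    """Convert 'Last, First' to 'First Last', optionally prefixed with a title."""
    last, sep, first = name.partition(",")
    if sep:
        full = f"{first.strip()} {last.strip()}"
    else:
        full = name.strip()  # no comma found: leave the name unchanged
    return f"{title} {full}" if title else full


records = ["Smith, Jane", "Doe, John"]
titles = ["Ms.", "Mr."]
restructured = [restructure_name(n, t) for n, t in zip(records, titles)]
print(restructured)
```

Names without a comma are passed through untouched, which keeps the function safe to run over records that are already in first name last name format.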

Although the software seems to work well, it is an older program and was apparently a victim of the screen resolution limitations of the time when it was published. As such, ParseRat's display is a bit cluttered. Fortunately, the interface has an enlarge function that was originally intended to help the visually impaired, but that also works great on modern, high-resolution displays.

ParseRat File Parser sells for $49.95, but a free trial version is available for download.

About Brien Posey

Brien Posey is a seven-time Microsoft MVP. He has written thousands of articles and written or contributed to dozens of books on a variety of IT subjects.

