Visualize complex data using Microsoft Data Analyzer

Do your clients' eyes glaze over when you present them with a spreadsheet full of data? Microsoft Data Analyzer allows you to give your clients an easily understandable picture of their organizations' data so they can identify trends.

About five years ago, I had a number of clients who asked for UNIX performance tuning. I would start a data capture script one day, and come back the next day to collect the data. Then it was off to my desk to dump the whole thing into Excel to produce some charts to show the client.

If I were to do this kind of assignment today, I'd use a new member of the Microsoft Office family: Microsoft Data Analyzer. Twenty-four hours of performance data is a lot to wade through in spreadsheet form. It's easy to miss a key trend, such as which applications correlate with certain bottlenecks. But this inexpensive data visualization tool lets you explore trends like these quickly and cost-effectively, and then build a great presentation.

I’ll show you how you can use Data Analyzer to explore data and touch on sharing the results with others.

First, some terminology:

Views, connections, and cubes
Data Analyzer stores data in multidimensional databases called cubes and displays it. To do so, it creates views into a given cube. The following are parts of the definition of the view:
  • A connection tells Data Analyzer where to find the data. Three types of connection are supported: to an Analysis Server on the network, to a Web server that provides HTTP access to an Analysis Server, or to a locally-stored OLAP cube file (filename.cub). Once defined, a connection may be used to create multiple views of that data source.
  • A cube is a multidimensional database. Each dimension is one subject by which the data can be summarized, such as region, time period, or sales territory. The values in a dimension are called its members, and are typically structured into a hierarchy of levels.
    For example, in a dimension called Time, each unique date is a member at the lowest level. At the next highest level, members are months that summarize all dates in that month. At the level above that, members are years. The top level of each dimension is a single member that groups all the data. The cube’s measures are numeric columns, which are summarized in the cube.
  • A view is a Data Analyzer file (filename.max). It identifies the connection and cube to be accessed, and the dimensions and measures to be displayed from that cube.
  • A dimension pane is a subwindow in a view, displaying the members of a single dimension. By manipulating the controls in the dimension pane, you explore the measures associated with each member. Each pane can be minimized to a tab at the bottom of the screen, or expanded into normal size.

Exploring the data
What makes Data Analyzer a powerful tool is the fact that all the dimension panes are linked to each other, much as cells in a spreadsheet can be linked together. When you select a member in one dimension, the other dimensions change as well. For example, if you select the year 2000 from the Time dimension, the other dimensions will limit themselves to data from that year.

Each dimension pane can display its contents as a bar chart (the standard view), a pie chart, or a data grid, as shown in Figure A.

Figure A

In a bar chart, the length of each bar is associated with one of the cube’s measures. You can easily see which products sold the most units by making the length track the Quantity Sold. Or, you could associate the cost or profitability with the length. The color of each bar is associated with a second measure.

Pie charts use length and color similarly. The size of a slice in the pie chart is determined by whichever measure is used for bar chart length, and the color of the slice uses the same measure for bar chart color.

Grid view simply displays the member names and all the measures in spreadsheet format.

Above each dimensional pane is a slider control that lets you affect the scale displayed in the pane, and to the left is a toolbar with buttons to control the pane’s display. Level controls let you drill down to a lower level in that dimension, drill up to a higher level, or reset the display to the default level for the dimension. Filter controls let you select which individual members you want to work with. Unselected members can either be hidden or displayed in a dimmed out color. Sorting controls display the members in the pane by length of bar, color of bar, alphabetical by member name, or in “natural” order (that which is predefined in the cube).

An example
When demonstrating Data Analyzer for clients, walk them through a sample scenario. Here’s an example using the sample Airline local cube that is included with Data Analyzer.

In the Region dimension, the length of each bar represents the revenue earned from flights in that region. The color of each bar represents the profitability of those flights on a scale from low (red) through middle (yellow/orange) to high (green).

Notice in Figure A that flights to Australia and Western Europe brought in the most revenue (longest bars). However, Western Europe flights were less profitable—orange vs. green for Australia—as shown in Figure B.

Figure B

If you double-click on Western Europe to drill down into that region, the bar chart will be replaced by another showing cities in the region. Most are profitable (green), but the two cities with the largest revenue, London and Paris, are not, as shown in Figure C.

Figure C

To include both in the filter, and exclude the others, press [Ctrl] and click on each city (see Figure D).

Figure D

Meanwhile, the other dimensions have been changing to follow your choices. Notice in the Type of Aircraft dimension that Jumbo jets are most often used for London and Paris (longest bars), but they’re not as profitable as the Super Jumbo (bright green), as shown in Figure E. Maybe it would be more profitable to fly Super Jumbo jets on those routes.

Figure E

To check the trend, click on 2001 in the Time dimension to select it. The two other dimensions change as well, but the trend is clear: Super Jumbo is more profitable on these runs.

Sharing the data
After you’ve spotted an important trend in the data, it’s time to share it with others. Data Analyzer can share its results several ways:
  • You can e-mail the view file to other users who have Data Analyzer installed. They can open the view, connect to the same cube, and see the same results.
  • For users who use Excel rather than Data Analyzer, you can export the view as either an Excel spreadsheet or a PowerPoint slide. The PowerPoint slide can be a standalone presentation, or be added to an existing presentation.
  • You can also save the data as an HTML page. Data Analyzer describes the data using XSL style sheets. You can customize the resulting page via style sheets to conform to client standards for presentation, and then publish them on an intranet or extranet Web server.

How to get Data Analyzer
Although Microsoft Data Analyzer is a part of the Office family, it isn’t included in any Office XP bundle. Like Microsoft Project 2002 and Microsoft Front Page 2002, it’s a separate application.

A single seat costs $179. Most large organizations can reduce that cost by purchasing multiple copies via Microsoft Open License or Microsoft Select volume buying programs. A demonstration copy on CD-ROM is available for $9.95 from Microsoft. The demonstration version is a complete version but is time-limited to 120 days.

Editor's Picks