Big Data

Understanding dimensions in an OLAP solution

Relational databases are by definition two-dimensional—rows and columns. Online analytical processing (OLAP) databases introduce additional dimensions, which you can think of as perspectives or vantage points. Discover why it's important to have a clear grasp of what dimensions are and how they're used in an OLAP solution.

TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!

In almost every small organization I've been with, and, surprisingly, in many medium-size organizations as well, there is no online analytical processing (OLAP) database. This may be in part because of a lack of human resources.

This article is primarily for developers who wear many hats, and who haven't split their database into two: online transaction processing (OLTP) and OLAP. I'll introduce the notion of an n-dimensional cube as a way of getting you to think about what your data might look like in an OLAP solution.

Consider your OLTP database and its most active tables (e.g., Sales, SalesDetails, Payments). Then, think about the foreign keys in each of these tables (e.g., SalesOfficeID, SalespersonID, ProductID). Each of these constitutes a dimension of your data. More complex dimensions exist (in particular, dates) yet they may not relate to foreign keys. Besides the division into year, month, and day, there are oddball divisions such as week, quarter, and even fiscal quarter. Sometimes the group of interest might be a double-foreign key (i.e., ProductGroupID, which resides in the Products table not the SalesDetails table).

Look at a column's values for repeats of interest. From this perspective, a column such as SalesID is of no interest since there are no repeats, whereas StoreID and ProductID are of interest since there will be many repeats.

Relational databases are by definition two-dimensional—rows and columns. OLAP databases introduce additional dimensions, which you can think of as perspectives or vantage points.

In Analysis Services 2000, everything revolves around a fact table, to which various dimension tables are related. (SQL Server 2005, formerly code-named Yukon, supports multiple fact tables, which is a significant improvement.) Analysis Services pre-computes the aggregates of interest for all the dimensions you define, providing virtually instantaneous retrievals for said aggregates—no matter how many dimensions you specify or how deep you want to drill down.

An effective use of Analysis Services depends on your understanding of dimensions and your appreciation of their use in combinations. You might create an n-dimensional cube whose dimensions include StoreID, ProductID, ProductGroupID, DateSold, and SalesPersonID. Once you describe your cube clearly, SQL Server will take a while to calculate all the values for all the points your dimensions describe (i.e., Total Sales for Q1 2004 by store, product, group, and salesperson). However, when managers want to know what the big picture is, you can tell them instantaneously.

The dimensions of interest will vary among organizations, but the concepts are the same across all organizations. Once you have a clear grasp of what dimensions are, you can deliver invaluable information to managers in a form that they can understand and appreciate.

Editor's Picks

Free Newsletters, In your Inbox