Big Data

Conform facts and dimensions to evolve your data warehouse over time

Building an enterprise data warehouse can be a complex and risky undertaking. Some organizations prefer to build smaller data marts first and let their data warehouse evolve over time. We'll tell you how planning can ensure the success of this approach.

An enterprise data warehouse can prove essential to the success of your business. But despite its value, building one is a daunting project. It can take years and cost millions. To show faster results at less cost, some experts recommend the alternative approach of constructing smaller data marts, which can later be combined into an enterprisewide data warehouse. (See my previous article, “Data marts deliver fast results, but proceed with caution.”)

This second, bottom-up approach is favored by most enterprises. But to ensure the success of a data mart project, some brief, but careful, enterprisewide collaboration and planning must take place at the inception. Otherwise, the completed data marts will simply replicate the problems they were intended to resolve: data redundancy, islands of information, disparate systems and interfaces, arguments over who has the ”right numbers,” and a host of other troublesome issues.

This article outlines the essential up-front planning that enables the creation of functional data marts that will later serve as the components of an enterprisewide data warehouse.

The data warehouse bus architecture
Initial planning activities should focus on developing and documenting an overall “proper” architecture to which individual data marts will conform. If you do so, you’ll realize two key benefits:
  • Your individual data marts will integrate seamlessly into the fabric of the enterprise data warehouse.
  • You’ll realize benefits from the data marts quickly, and the marts will allow you to develop and evolve your enterprise warehouse over time.

Data warehouse bus architecture (DWBA), advocated by data warehouse expert Ralph Kimball, is the best architecture to follow. The term “bus” in this sense—conceptually akin to the expansion-bus in your computer—allows you to plug and play data marts like a plug and play PC video card. ”The enterprise data warehouse consists of separately implemented data marts bound together with a powerful architecture based on…conformed dimensions and conformed facts,” to quote Kimball. This sounds exactly like what we’re after, but what are “conformed dimensions and facts”?
Don’t miss the previous installments in this series:
Conformed dimensions
Conformed dimensions can be used to analyze facts from two or more data marts. Suppose you have a “shipping” data mart (telling you what you’ve shipped to whom and when) and a “sales” data mart (telling you who has purchased what and when). Both marts require a “customer” dimension and a “time” dimension. If they’re the same dimension, then you have conforming dimensions, allowing you to extract and manipulate facts relating to a particular customer from both marts, answering questions such as whether late shipments have affected sales to that customer.

Suppose now that you add a “marketing” data mart to help you analyze product promotions. Again, with conformed customer and time dimensions, you’re able to analyze the effects of a particular product promotion on sales. (Analyzing facts from more than one fact table in this way is termed “drilling across.” My previous article, “Thinking dimensionally aids business intelligence design and use,” explains the function of facts and dimensions.)

As this example shows, the very same conformed dimensions—in this case, time and customer dimensions—have meaning in the context of three independentlydeveloped data marts. These dimensions become enterprise property and can be used later in other marts as you evolve the enterprise data warehouse.

Conformed facts
In addition to conformed dimensions, you need conformed facts. Conforming a fact really amounts to standardizing the definitions of terms across individual marts. Often, different divisions or departments use the same term in different ways. Does “revenue” refer to “gross revenue” or “adjusted revenue”? Does “units shipped” refer to cases of items or individual items?

Make certain your design team develops, early on, a uniform enterprise taxonomy—and enforce it.

When marts can stay marts
Sometimes you don’t need to be so concerned with an architecture aimed at tightly integrating your marts, and a particular mart can just stand alone. Data marts constructed to focus analytical support to solve a single business problem are examples of this and are termed “disposable” data marts.

Here’s an example: A transporter of new automobiles was bearing the burden of costly vehicle damage claims. A data mart was built in order to gain insights into the causes for the claims—which routes were taken on which days and which parts of the cars were being damaged. As a result of the analysis, procedural changes reduced damage claims to acceptable levels and the mart was then discarded.

Remember these three key points when planning a bottom-up approach to data mart construction:
  • The bottom-up approach enables you to quickly realize the benefits of low-cost and easy-to-build data marts while the enterprise data warehouse evolves over time by combining the individual marts.
  • Using conformed dimensions and facts in your independent data marts allows you to combine them later. If you fail to conform dimensions and facts, your marts will be “stovepipe” marts, and you’ll have little hope of integration. You’ll wind up with data islands and disconnected information.
  • The data warehouse bus architecture forces compliance on these points.
Have you undertaken an enterprise data warehouse initiative? Does your experience include addressing some of the questions and issues discussed in this article? Share your thoughts. Send us a note or post a comment below.

Editor's Picks