Data Centers

Seven steps to smooth data warehouse development

Learn seven steps to implementing a data warehouse.

For most IT Consultants, a data warehouse implementation is unlike any project previously undertaken. With the considerable differences in data structures, usage, and application development methods, the knowledge and experience you bring to the task is, largely, not applicable. But with a few minor adjustments in your approach, you can deliver on a warehouse implementation, even if it’s your first.

Take a look at the following list. Some of these steps you might not ordinarily think of, while some you probably take into account in a new implementation, but you can benefit from a slightly different perspective. The way to go is to keep an open mind about trying new approaches, and be creative in modifying your tried-and-true ways of working where necessary.

1. Rethink your approach to applications
A data warehouse isn’t about transactions, and plays only a small part in reporting. The essence of data warehousing applications is analysis—specifically analysis of business intelligence (BI). And BI isn’t data in the conventional sense: it is new data derived, often deductively, from patterns seen in old data. How do you create apps to dig it out? Well, you don’t, really—your clients do, and they kind of make it up as they go along. From a project management point of view, it may be the case that a skillful spreadsheet designer has an edge on you in how to put together these types of applications. Meet this challenge by letting go of the thinking that’s gotten you through in the past, and start focusing on how to view data in new ways—the way your clients are trying to do.

2. Create abstract, array-friendly database access components
Here’s a handle on the difference between the systems you’ve done in the past and the data warehouse: In Online Transaction Processing (OLTP), many users do lots and lots of small database reads; in Online Analytical Processing (OLAP), a handful of users do a few very big database reads. And it’s your job to write app components that will optimize this difference! Here’s a clue: In many of your analytic apps, you’ll be grabbing data items that are sequential—so you want to create and access data structures that keep that data in the same physical proximity. How? First, don’t normalize it. Second, store it in arrays that minimize the number of reads required to get it all. With such an approach, you’re sure to make a friend of the DBA, who’s going to need one.

3. Hold on loosely
Looking back to step one, you can imagine that analytic apps are not easily defined, and certainly not implemented in final form the first time around. And the same holds true of the data structures you’re going to be putting in place to accommodate these analytics. In short, there will be many, many changes. In a conventional implementation, you try to keep this to a minimum. In a data warehouse implementation, it’s essential to let this process run its course; and the DBA needs to be on board with this! Don’t hold on to your app designs, code, schemas, or anything else you create. It will change often, so go with it.

4. Put housekeeping first
How are you in analyzing data sources? Do you consider the patchwork of data-cleansing transforms to be drudgery? Well, you’re certainly not alone—most of us do. There is no duller work. However, part of the point of a data warehouse is to bring together vast seas of old data into a common facility, and all of this data must be consistent. So analyzing data sources and putting in many hours writing transform routines to clean up the data on its way into the warehouse is a critical part of the implementation. It would not be overstating to say that this is the most important single phase of the project; it can take up to three-fourths of the schedule and budget of the project. Make peace with it.

5. Listen between the lines
You’re about to sit through some of the toughest user interviews you’ll ever do. Why? Because these people don’t know what they want and won’t really be able to articulate their end product until they see it. Defining data warehouse applications is an exploratory process, and a very iterative one. Remember that the users themselves will define "business intelligence" and they’ll do it as they go. They are now the bridge between data and business processes. What they’re seeking is not data itself but intelligence hiding behind it. Let them talk, speculate, and get creative. Don’t push them to solutions; push them to think and speak freely about possibilities. And pay very close attention to the result.

6. Take the lead
If data warehousing seems to be a lot less grounded than conventional OLTP work, well, your assumption is correct. Lots of people are going to be jumping in, and since the framework is so different from the norm, it will look chaotic at times. But continuity is important. So, there are two side roles that are critical and you may be right for one of them.
  • First, there needs to be a technical lead. This role tracks the deployment and correct use of new software tools and development processes across all phases of the project. If this fits your background then volunteer and give it close attention.
  • Second, there is the architectural lead. This role ensures continuity is maintained within the physical and logical framework of the data warehouse and its support systems as the project moves through its very diverse phases. This, too, may be something you can offer.

7. Sound the alarm
Finally, remember that you are not the only one in uncharted territory. Everyone around you is carrying around one or more of the following: unrealistic expectations, misconceptions about the technology, old (bad) habits, competing agendas, or a lack of belief in the project. Diplomacy is really the province of the project manager, but you’re going to run into all of these things, as well. What should you do? Be honest with everyone around you, and above all, be willing to sound the alarm. When you see the scope creeping, the resources shifting, and people losing their way, speak up. Do it quickly and do it clearly. This is usually wise in any case, but in the past, politics may have sometimes intruded. In this context, you don’t have the luxury of allowing for them. A data warehouse can go way off the tracks in a hurry. Don’t let it take you off-track as well.

About Scott Robinson

Scott Robinson is a 20-year IT veteran with extensive experience in business intelligence and systems integration. An enterprise architect with a background in social psychology, he frequently consults and lectures on analytics, business intelligence...

Editor's Picks