Data Management

Multi-dimensional modeling in Oracle SQL


A spreadsheet is often used as an example of a relational table. Its rows and columns are similar to the rows and columns in a database table. At the intersection of each row and column is a value.

The difference is that, in spreadsheets, each cell can also contain a formula that calculates the value of that cell. The formula can be totally different in each cell. In relational tables, only values are stored. You can use calculations in queries, but those calculations apply equally to all rows in the query.

Starting in Oracle 10g, you can treat the output of your query as if it were a spreadsheet and perform calculations upon individual cells of the output. You can even generate new rows of output that weren't in the original table. The MODEL clause of the SELECT statement identifies which columns to aggregate (measures), which columns serve as unique array indexes (dimensions), and which formulas calculate the values (rules).

Listing A shows a query that creates subtotals from data in the Order Entry (OE) sample schema. Examining the output, I see that sales were relatively small before 1998, and I'd like to roll them into the 1998 total and eliminate them from the report. Listing B uses the MODEL clause to do this.

DIMENSION BY indicates which columns will serve as indexes into the virtual spreadsheet. There are two: order_mode (online or direct) and order_year. MEASURES indicates which column will provide values for the cells. In this case, it's order_total, which I rename "tot" for ease of typing.

RETURN UPDATED ROWS eliminates any rows that our rules don't update -- this is a way of getting rid of the years earlier than 1998 after we process them.

The RULES list indicates how to calculate the cells. Our model has two rules:

Tot[ANY, 1998] = SUM(tot)[CV(order_mode),order_year <=1998]

Tot[ANY, order_year>1998] = tot[CV(order_mode), CV(order_year)]

The first rule says that for 1998 set the total to the sum of all years up to and including 1998. The keyword ANY performs this calculation for both order modes using a single rule. The second rule says that for years after 1998 just use the existing total (current value, or CV, is used to cover all order modes and all years).

Using rules, you can access all rows in the query at any time simply by identifying which combinations of indexes uniquely identify the array cell. Some example applications are:

  • Force the results of particular rows to a specific value.
  • Create new rows (forecasts) from existing ones by applying a formula.
  • Set the result of a row to the maximum, minimum, or average of a set of rows.

Although this small example used only two dimensions, SQL models can be much more complex. The MODEL clause has many options for calculating data beyond what is shown here. For more information on MODEL, see the Oracle Database Data Warehousing Guide, Chapter 22, SQL for Modeling.

Bob Watkins (OCP, MCITP, MCDBA, MCT) is a computer professional with 25 years of experience as a technical trainer, consultant, and database administrator. He is a Senior Consultant and Managing Partner at B. Watkins, a database consulting and training firm in the Dallas / Fort Worth area. Visit Bob's site.

--------------------------------------------------------------------------------------------------

Get Oracle tips in your inbox

TechRepublic's free Oracle newsletter, delivered each Wednesday, covers automating Oracle utilities, generating database alerts, solving directed graph problems, and more. Automatically subscribe today!

0 comments