At the core of data warehouse applications and OLAP (online analytical processing) is a specialized schema that relaxes the rules of Third Normal Form RDBMS schemas in favor of faster analysis and processing of large amounts of data. It's important to understand the difference between these kinds of schemas during application design or migration, rather than later when schema changes are more costly.
Most database developers encounter Third Normal Form schemas before any other schemas. The EMP/DEPT examples use Third Normal Form schemas as the basic pattern. In this form of database normalization, an object or entity is related to another object or entity by a primary key and foreign key on an individual basis. (The foreign key DEPTNO in the EMP table defines the relationship between EMP and DEPT tables.) Here is an example of a Third Normal Form database schema:
create table brands
(
id number primary key,
name varchar2(30) not null
);
create table products
(
id number primary key,
brand_id number not null,
name varchar2(30) not null,
foreign key (brand_id) references brands(id)
);
create table sales
(
product_id number,
when date,
how_many number,
foreign key (product_id) references products(id)
);
create table product_dims
(
id number primary key,
name varchar2(30) not null,
brand varchar2(30) not null
);
create table time_dims
(
id number primary key,
when date
);
create table sale_facts
(
product_id number not null,
time_id number not null,
how_many number,
foreign key (product_id) references product_dims(id),
foreign key (time_id) references time_dims(id)
);
Star schemas do have some inherent problems. For instance, the central fact table can grow very large, with an upper limit of the product of the number of rows in each dimension table. Also, the dimension tables are no longer normalized, so they are larger and harder to maintain with lots of duplicate data.
In my example above, brand was denormalized and needs to be maintained separately. To go back toward normalization a bit, there is also a snowflake schema, which is a star schema with some of the features of a Third Normal Form data. Here is the same oversimplified example, but in a snowflake schema:
create table brands2
(
id number primary key,
name varchar2(30) not null
);
create table product_dims2
(
id number primary key,
brand_id number not null,
name varchar2(30) not null,
foreign key (brand_id) references brands2(id)
);
create table time_dims2
(
id number primary key,
when date
);
create table sale_facts2
(
product_id number not null,
time_id number not null,
how_many number,
foreign key (product_id) references product_dims2(id),
foreign key (time_id) references time_dims2(id)
);
The problem with snowflake schemas is that they may start to show signs of the performance problems of Third Normal Form queries. I don't recommend using snowflake schemas with Oracle's cost-based optimizer.
Scott Stephens worked for Oracle for more than 13 years in technical support, e-commerce, marketing, and software development. For more of his Oracle tips, visit our Oracle Dev Tips Library.






