database-designnormalizationolapdatabase-normalizationoltp

Normalization in OLAP and OLTP


Despite going through books and articles, I am not able to find a concrete answer for the below.

What should be the minimum and maximum degree of normalization for OLAP and OLTP?

I presume, the minimum for OLTP is 3rd Normal form and the maximum for OLAP is 2nd Normal form.

Can we please have details to supplement the answers?


Solution

  • Some of OLAP takes the form of data warehouses, data marts, or reporting databases. One design takes the form of a star schema, introduced by Ralph Kimball. In a star schema, the fact tables are sometimes in 1NF while the dimension tables are generally in 2NF. You can decompose dimension tables so that they are in 3NF or even BCNF, and you end up with a snowflake schema.

    In general, star schema design is a tight enough discipline so that you really don't think in terms of either normalization or denormalization, but rather in terms of good star design. Naturally, departures from normalization result in update anomalies, and that complicates life for your ETL processing.

    Star schemas are really a way of representing what Kimball called a Dimensional Model on an SQL platform. These turn out to be useful for data marts and reporting databases as well as data warehouses.

    William (Bill) Inmon, another innovator in data warehouses, advocated keeping to 3NF even when doing OLAP. Both Kimball and Inmon are smarter than I am, and I'm not going to try to tell you which one is right. When I've done OLAP, I've tended to follow Kimball, with good results.