Sunday, February 9, 2014

Snow Covered Wagon hitched to a Star (Dimensional Modelling)

Hello One and All,

In my last Blog we discussed about Dimensional Modelling and some of its components. Today we will go through different Schema that can be used during Dimensional Modelling to create a Data Warehouse.

Before we start with today's topic , For my viewers those who are new to this field i would like to revisit some of the key points of my previous blogs:

1)  Business Intelligence is mainly divided into three parts as per my understanding
                 a) Data Warehouse design and Implementation (ETL process)
                 b) Data Analysis (Using OLAP cubes)
                 c) Reporting and Dashboard Creation
For further details revisit my First blog 

2) Important Components involved in Dimensional Modelling or Data Warehouse Designing
                 a)  Fact Tables (Additive Facts, Semi-Additive Facts, Non- Additive Facts)
                 b)  Dimension Table
                 c)  Grain
For further details revisit my Second blog

After a thorough revision of previous concepts lets start our today's discussion about different Schema involved in Dimensional Modelling or Data Warehouse Designing.

First of all i would like to explain the meaning of the topic i.e
Snow Covered Wagon Hitched to a Star = SnowFlake Schema and Star Schema are two types of Schema that are used while designing a Data Warehouse, Hence they can be explained as follows:

Star Schema:  A Star Schema is one of the simplest and easiest schema to understand. A schema which consists of Dimension tables only attached to Fact tables. A Star Schema get its name from its physical representation where we have Fact table at the center with Dimension table surrounding it representing the star points. Star Schema can also be called as a denormalized schema as it contains Dimensional table in the lowest normal form.

Star Schema
Snow Flake Schema:   The Snowflake Schema can be formed by Snowflaking(Normalizing) the Star Schema . The Dimension table in a Snowflake Schema may or may not contain a parent table depending on its normal form. All the dimension tables in Snowflake schema are normalized to third normal form (thus dimension table does not contain any transitive dependencies).


SnowFlake Schema
Pros and Cons of both the Schemas are as follows:

Star Schema:
Pros: A Star Schema is very quick in response to ad-hoc queries as this kind of schema can answer maximum queries with minimum number of joins.
Cons:  A Star Schema suffers in space and data management as the data is redundant in dimension table as the tables are in their lowest normal forms.

Snowflake Schema: 
Pros:    A SnowFlake Schema is preferred for Space and Data Management as the Schema consists of table in third normal form.
Cons:   A SnowFlake Schema takes large amount of time in querying database as this type of design involves joins to answer a query and as joins are highly expensive.

Hence a choice between Schema is made as per business requirement, the system that requires faster query performance uses Star Schema whereas the one that requires Space Management uses SnowFlake Schema.

With this i conclude my blog about Schema. Stay tuned to learn more about different challenges and tasks involved while designing a Data Warehouse. In my coming blog we will talk about Surrogate Keys, Natural Keys, Durable Keys, Slowly Changing Dimensions Type and much more.

References:
The Data Warehouse Toolkit - Third Edition, Ralph Kimball and Margy Ross
http://en.wikipedia.org/wiki/Dimensional_Modelling

















No comments:

Post a Comment