Wednesday, February 5, 2014

Unraveling the Unravelable Dimensional Modelling

Hello All,

In my previous blog, we went through different concepts on a macro level. From this blog onward we will delve deeper and will understand each concept in detail.

First concept that we will talk about in this blog is about Dimensional Modelling.

What is Dimensional Modelling?
From a high level, Dimensional Modelling is the technique, concepts in designing a Data Warehouse. We build data warehouse with certain design methodology, technique  in order to analyse billions of rows in few seconds which is called as Dimensional Modelling.

Some of the key elements of Dimensional Modelling are as follows:

  1. Fact Table
  2. Dimension Table
  3. Grain


Fact Table:   A fact table is a simple  table that contains facts and dimensions. Facts means numerical values  that helps in making decision when aggregated or analysed across different dimensions for example profit in a transaction, cost paid by customer in a transaction. The facts are stored in the fact table corresponding to a particular set of dimensions. For example we store profit by product by store by time by customer type. Therefore profit here is the fact and product, store, time, customer type are dimensions.

Facts are of three different types:

Additive Facts: These are the facts when aggregated across dimensions have a meaningful outcome For example :  Profit

Semi-Additive Facts: These are the facts that can be aggregated across some dimensions but does not     have a meaningful outcome when aggregated across other dimensions. For example: To aggregate the stock level  of a product on a particular day, but aggregating it over a period wont tell us about the correct stock level value about the product.

Non- Additive Facts: These are the Facts that can not be aggregated across any dimension. For example: ratio, percentage etc.

Dimension Table: This table contains values which adds meaning to fact values.
The table that contains dimensions across which facts are sliced, diced , drilled down and drilled through. Dimensions helps in bringing out the values that are useful for business people. For Example Fact such as profit is useless until we do not know that profit value is by product by time by location and many more which will help in inferring useful information.

Grain:  The highest level of detail that your data contains is called grain. Grain helps in defining the level of questions that your dimensional model can answer, for Example lets consider our transcript. A transcript contains data at three levels i.e.
     1) GPA at Subject Level
     2) GPA at Semester Level
     3) GPA at Year Level

These are three  different levels of grain,  hence if we store data regarding subject we can calculate our GPA at Semester and Year level, and if we store our GPA regarding semester level we can roll it up to Year level but can not drill down to Subject level, Similar is the case for Year level.

Thus determining the grain is the most important design aspect of Dimensional Modelling.With this i conclude my blog on Unraveling the Unravelable Dimensional Modelling

Stay tuned to know more about dimensional Modelling. In my next blog i will discuss about Star Schema, Snow-Flake Schema, Business process and a glimpse into Market Basket Analysis.

References:
The Data Warehouse Toolkit - Third Edition, Ralph Kimball and Margy Ross
http://en.wikipedia.org/wiki/Dimensional_Modelling












No comments:

Post a Comment