Wednesday, February 26, 2014

Keys Locking Up The Sane Data

Hi All,

In the previous blog we discussed about different Schema in which a Data Warehouse can be designed. Now lets discuss some of the other important concepts which are essential in order to provide flexibility and scalability to a Data Warehouse.

Some of the important concepts regarding the keys in order to maintain data integrity are as follows:

Surrogate Key:
The advent of Surrogate key comes from the basics of maintaining temporal or historical data in the database. Distinction between a Surrogate Key and Primary Key depends on whether the database is Current Database or Temporal Database. Lets discuss Surrogate Key usability in two different types of database
       
Current Database:
Since current database stores only current data , hence there is a one to one relationship between Surrogate Key and Primary Key. In this case the Surrogate Key may be used as a primary key.
     
Temporal Database:
As temporal database stores historical and temporary database therefore there is a one to many relationship between Surrogate Key and Business key. Distinction between these two keys is highly important as Surrogate Key identifies one row in the table uniquely, whereas business key identifies a business entity in the business model uniquely. The table holds the life span of the entity in the modeled world. An example of a temporal table containing Surrogate Keys.


Advantages of Surrogate Key:
  • Application can not lose their reference to a row in the database.
  • Surrogate key tend to be a compact data type, hence they perform faster when queried
  • Surrogate helps in maintaining the uniformity of detail or an entity throughout the data

Natural Key:
A natural key is a key that is formed on the entities or attributes that already exists in the world. For example a table consisting of Genetic DNA makeup of a person can be considered as a Natural Key.

A Natural key is a candidate key that has a logical relationship to the attributes within that row.A Natural Key is sometimes called as a Domain Key. It ensures that there can only be one row for a Key.

Disadvantages of Natural Key:
We need to choose natural key cautiously as choosing a column that may change dynamically can lead to problems. For example if we choose name of a person as natural key and if the name of the person changes due to religion transformation, changing the name of a person in the database and reflecting the changes across all the table containing a foreign key will be an overhead for the database.


With this i would like to conclude my blog on "Keys Locking Up The Sane Data". Stay tuned to learn about Slowly Changing Dimensions, the next important step in creating a Coherent Dimensional Model.

References:
The Data Warehouse Toolkit - Third Edition, Ralph Kimball and Margy Ross
http://en.wikipedia.org/wiki/Dimensional_Modelling
http://en.wikipedia.org/wiki/Surrogate_key
http://en.wikipedia.org/wiki/Natural_key







No comments:

Post a Comment