Kuldeept’s Weblog

Just another WordPress.com weblog

  • Categories

  • Archives

Archive for September, 2008

Surrogate Key

Posted by kuldeept on September 20, 2008

Surrogate key is a substitution for the natural primary key.

It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.

Data warehouses typically use a surrogate, (also known as artificial or identity key), key for the dimension tables primary keys. They can use sequence generator, or Oracle sequence, or SQL Server Identity values for the surrogate key.

It is useful because the natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult.

Let me give you a simple example:

Employee ‘E1’ belongs to Business Unit ‘BU1’ (that’s what would be in your Employee Dimension). This employee has a transactions on the Business Unit ‘BU1’ But later on the Employee ‘E1’ is shifted from Business Unit ‘BU1’ to Business Unit ‘BU2.’ All the transactions have to belong to the new Business Unit ‘BU2’ but the old one should Belong to the Business Unit ‘BU1.’

If you used the primary key ‘E1’ for your employee within your data warehouse everything would be allocated to Business Unit ‘BU2’ even what actually belongs to ‘BU1.’

If you use surrogate keys, you could create a new record for the Employee ‘E1’ in your Employee Dimension with a new surrogate key.

This way, in your fact table, you have your old data with the SID of the Employee ‘E1’ + ‘BU1.’ All new data would take the SID of the employee ‘E1’ + ‘BU2.’

You could consider Slowly Changing Dimension as an enlargement of your natural key: natural key of the Employee was Employee Code ‘E1’ but for you it becomes Employee Code + Business Unit – ‘E1’ + ‘BU1’ or ‘E1’ + ‘BU2.’ But the difference with the natural key enlargement process, is that you might not have all part of your new key within your fact table, so you might not be able to do the join on the new enlarge key -> so you need another id.

Posted in Data Warehouse | Tagged: | 1 Comment »