Replacing a natural key with a surrogate key is structural refactoring method of replacing an existing natural key with a surrogate key.
Why replace a natural key with a surrogate key?
A natural key and a surrogate key are two types of primary key. A natural key is a single column or a combination of columns that has a business value and occurs naturally in the real world (e.g. Social security number, International Standard Book Number…). A surrogate key in SQL Server is created a by assigning an identity property to a column that has a number data type. A surrogate key is a value generated right before the record is inserted into a table.
There are several reasons to replace a natural key with a surrogate key. A surrogate key can be used to reduce coupling since it doesn’t have a business value and it’s not coupled with any external application connected to a database. For example, if the business logic changes at some point, that would require to update a natural key across all foreign key relationships. In some cases, even using Social Security Number or International Standard Book Number as a natural key doesn’t guarantee that a primary key will be unique. A surrogate key value is unique and since it doesn’t have a business value it won’t be updated over time.
Also, using a surrogate key may increase performance because a large natural key can degrade database performance, and due to a fact that surrogate keys are usually integer values a smaller index on a primary key will have better performance on JOIN operations. In case of combined natural primary keys JOIN operations can became very complex:
Introduce a Surrogate Key
In this example, the Account table with a combined primary key will be used:
CREATE TABLE Account ( AccountNumber int, AccountType varchar (6), AccountDescription varchar(20), PRIMARY KEY (AccountNumber, AccountType));
As it can be seen from the DDL script the table is referenced by a foreign key in the Client table:
CREATE TABLE Client ( AccountNumber int, AccountType varchar (6), FirstName varchar (10), LastName varchar (10), FOREIGN KEY (AccountNumber, AccountType) REFERENCES Account (AccountNumber,AccountType));
Writing JOIN statement for tables that include composite primary keys is getting more complex, depending on the number of columns combined in a primary key. Every child table has to include all primary key columns in the JOIN statement:
SELECT a.AccountNumber,a.AccountDescription, c.LastName, c.FirstName FROM dbo.Account a INNER JOIN dbo.Client c ON c.AccountNumber = a.AccountNumber AND c.AccountType = a.AccountType
If the composite primary key is replaced on the Account table with one key the JOIN statement will be more understandable:
SELECT a.AccountNumber,a.AccountDescription, c.LastName, c.FirstName FROM dbo.Account a INNER JOIN dbo.Client c ON c.AccountID = a.AccountID
To apply this refactoring method, the following needs to be done:
-
Drop the foreign key relationship with the current primary key from the Client table:
ALTER TABLE dbo.Client DROP CONSTRAINT FK__Client__73DA2C14 GO
-
Drop the primary key constraint from the Account table:
ALTER TABLE dbo.Account DROP CONSTRAINT PK__Account__BA555B37D93B2923 GO
-
Introduce the new key column as IDENTITY and PRIMARY KEY:
ALTER TABLE Account ADD AccountID int IDENTITY(1,1) PRIMARY KEY NOT NULL GO
-
Create the foreign key relationship with the new primary key by adding the AccountID column to the Client table:
ALTER TABLE Client ADD AccountID int FOREIGN KEY REFERENCES Account (AccountID) GO
-
Since AccountNumber and AccountType columns are no longer needed in the Client table they can be dropped:
ALTER TABLE Client DROP COLUMN AccountNumber GO ALTER TABLE Client DROP COLUMN AccountType GO
DLLs for Account and Client tables are now:
CREATE TABLE dbo.Account ( AccountNumber int NOT NULL, AccountType varchar(6) NOT NULL, AccountDescription varchar(20) NULL, AccountID int IDENTITY(1, 1) PRIMARY KEY NOT NULL ) CREATE TABLE dbo.Client( AccountID int FOREIGN KEY REFERENCES dbo.Account (AccountID) FirstName varchar (10) NULL, LastName varchar (10) NULL )
New relationship diagram:
Add surrogate key using ApexSQL Refactor
To replace a natural key with a surrogate key using ApexSQL Refactor, SSMS and VS SQL database refactoring tool, choose the table in the Object Explorer panel and from the Apex SQL Refactor menu choose the Add surrogate key command, under the Other refactors sub-menu:
Click the Preview button to see the generated script in the Add surrogate key window:
In the Dependencies and the Sequence tabs it can be seen the list of all dependent objects and the order in which changes will be made respectively. In the Warnings tab, it can be seen all referenced objects that require a user’s analysis before running the script.
Useful resources
Refactoring Databases – Evolutionary Database Design
Natural versus Surrogate Keys. Performance and Usability
Surrogate Key vs. Natural Key
February 20, 2014