SQL Refactoring techniques – How to split a table in SQL

Splitting a table by moving set of columns into a new table is called vertical partitioning. Horizontal partitioning is having different tables with the same columns but contain different (distinct) sets of rows

Why split a table in SQL?

Most often, the reasons for splitting a table vertically are performance related and/or restriction of data access. Having a table where most of the external applications access one set of data more often (e.g. persons’ name, SSN etc.) while other data (e.g. person’s picture) is required less often you can improve performance by splitting the table and move the less accessed columns into another table. This will improve the time to retrieve data from a table, especially in cases of applications that select all columns from a table. Another example of vertical table partitioning is restricting access to some information within a table (e.g. salary, password, login information etc.). By splitting a table and moving columns you want to protect you can assign different access rights to a table that contains sensitive data

Vertical table splitting

We will show and example of vertical partitioning of the Employee table to restrict access to the Salary column

The current structure of the Employee table is shown here:

To implement split table SQL refactoring you need to do the following:

  1. Introduce a new table using CREATE TABLE. You may also find it useful to use an existing table that may be suitable to just move the columns into

    Here, we will introduce a table EmployeeInfo with information about the employee’s salary and the login details:

    CREATE TABLE EmployeeInfo
    (
    LoginInfo varchar(20) PRIMARY KEY,
    Salary varchar (20)
    )
  2. Copy data from the Employee table to the EmployeeInfo table

    INSERT INTO EmployeeInfo (LoginInfo, Salary)
    SELECT DISTINCT  LoginInfo, Salary
    FROM Employee

  3. In the transition period, in the case of a multi -application environment, you’ll need to introduce triggers on the Employee table to keep the columns that belong to a common table synchronized. The triggers must be invoked by any change to the Employee table.

    Triggers on the Employee table:

    CREATE TRIGGER TriggerOnEmployee
       ON Employee
       AFTER INSERT, UPDATE
    AS 
    BEGIN
       
          INSERT INTO dbo.EmployeeInfo
          (
     
              LoginInfo,
              Salary
          )
    SELECT i.LoginInfo, i.Salary FROM INSERTED i
    END
    
    CREATE TRIGGER TriggerOnEmployeeDelete
    ON dbo.Employee
    AFTER DELETE
    AS
    DELETE FROM dbo.EmployeeInfo
    WHERE LoginInfo IN
    ( SELECT d.LoginInfo FROM DELETED d)
    GO

  4. Create the referential relationship between tables:

    ALTER TABLE Employee
    ADD CONSTRAINT fk_Login FOREIGN KEY ( LoginInfo )  
    REFERENCES EmployeeInfo ( LoginInfo ) 
    GO

  5. Drop the Salary columns from the Employee table:

    ALTER TABLE Employee
    DROP COLUMN Salary
    GO

The partitioned table will have the following structure:

You’ll also need to rewrite all procedures, views and dependent objects, as well as external applications that access and reference the table to reflect the changes. If there are queries that involve columns from both new tables the query processing engine would require joining two partitions of the tables to retrieve data. In this example, we have the EmployeeView view that retrieves all columns from the Employee table, and we’ll rewrite this view using INNER JOIN to get the data from the both tables:

Before:

CREATE VIEW EmployeeView
AS  
SELECT  * 
FROM    dbo.Employee 

After:

ALTER VIEW dbo.EmployeeView 
AS  SELECT  Employee.EmployeeID, 
	        Employee.FirstName, 
		Employee.LastName, 
		Employee.LoginInfo,
	        EmployeeInfo.Salary
         
    FROM    ( dbo.Employee 
              INNER JOIN dbo.EmployeeInfo 
                  ON Employee.LoginInfo = EmployeeInfo.LoginInfo
            ) 
GO

Using ApexSQL Refactor an SSMS and VS refactoring add-in with database refactoring features you can split a table using the Split Table feature. Choose a table from the Object Explorer pane and from the ApexSQL Refactor main menu, under the Other refactors sub-menu select the Split Table feature

We will use the Employee table again.

In the Split Table dialog you need to enter a name for a new table you’re creating. We will name it EmployeeInfo like in the previous example

Also, you need to specify whether you want a foreign key to be created on a primary or on a secondary table. In this example, it determines that this table split is for one-to-one relation, and we’ll set up the foreign key on the primary table

You need to move columns you want on the other partition. One column needs to be copied from the primary table to keep the referential integrity of partitioned tables. We will copy the LoginInfo for referential integrity, and move the Salary column

ApexSQL Refactor offers a choice to choose the type of the JOIN statement. As we previously mentioned this is needed if you have queries that retrieve data from all columns which will now be partitioned in two tables:

When you set up the partitioning, click on the Preview to see the created script:

The Warnings section will inform you of any potential issues due to splitting a table:

All dependencies will be listed in the Dependencies section, which in this case is the EmployeeView view that retrieves all columns from the table:

The Sequence section will give you the exact steps in the process of splitting your table which will show you the refactoring actions in order in which they occur:

To see and edit the generated script in the SSMS query window click the Create script button. You can see that a script in the query window contains all the steps needed to automatically split a table and update all dependencies:

Useful resources:
Partitioning
Vertical Partitioning as the way to reduce IO
Top 10 steps to optimize data access in SQL Server: Part V (Optimize database files and apply partitioning)

March 10, 2014