SQL code refactoring – ways to improve database performance (part II) – Split table

In most cases, splitting a table into two tables containing different columns is required in order to address database design changes, business requirements or even adding domain restrictions retroactively (for example, isolating currencies in a separate table and enforcing referential integrity via a foreign key to ensure that only valid currencies can be stored in the database). However, splitting tables may actually offer additional performance benefits to the database.

For instance, in scenarios where a table contains large, but rarely used fields, moving them to a separate table will increase performance as the frequently used data will be stored in a much smaller table, and the rarely used data will only be looked up when required. The impact on performance caused by the occasional joining will be compensated just by having SQL Server look up the data that’s used more often in a table which requires less disk space. In general terms, JOIN costs CPU time whereas large, flat tables cost disk space and disk access time. From a performance point of view, the disk is still orders of magnitude slower than memory and since large tables usually increase the number of logical reads, sacrificing some CPU cycles on that extra JOIN might significantly improve the overall database performance.

Additionally, in some cases table refactoring does wonders for the performance of the UPDATE queries. For example, in a scenario where a table holds the phone records of 10,000 customers from the same area, an area code change would result in having to update 10,000 rows. However, had the area code has been separated into a different table referencing the original one with a foreign key, the update would’ve affected just a single row.

Now, here comes the catch – splitting a table is rather easy when designing the database. However, once the database is populated and deployed, partitioning a table vertically becomes a whole new ball game. To split a table in this case, the following steps need to be taken:

  1. Create a new table which will contain the columns to be moved from the old one
  2. Determine the key candidates for the new table
  3. Based on the key candidates, create a primary key on the new table
  4. Determine whether the new table will depend on the old one or if it’s the other way around and create the appropriate foreign keys
  5. Depending on the columns to be moved to the new table:
    • Copy, move or split the indexes if they refer to columns shared between the old and the new table
    • Copy or move the table and column level extended properties and permissions
    • Review and copy or move the table level check constraints
    • Review and rewrite, delete or recreate triggers on the new and the old table
    • Copy the data to be stored in the new table’s columns
  6. Identify all old table’s dependencies
  7. Update the identified dependencies, so they either reference the columns which will remain on the old table, the ones that will be moved to the new one or a JOIN statement which will include both
  8. Make sure that the updates on the dependent objects are performed in the correct order (e.g. if a stored procedure references the old table and a view which will depend on the new table, make sure to update the view first and the stored procedure afterwards – not vice versa)
  9. Drop the appropriate columns from the original table

To add insult to injury, this process is not only tedious, as witnessed by the steps above; it’s very error-prone, particularly since missing a dependency due to SQL Server’s own shortcomings, can completely break the database down the road. So, is there any way to enjoy the performance benefits of splitting large tables without any of the risks to the integrity of the database?

Indeed. This is where ApexSQL Refactor can do the formatting itself.

ApexSQL Refactor is a SQL Server Management Studio and Visual Studio add-in which formats and refactors SQL code using 11 code refactors and over 160 formatting options. It expands wildcards, fully qualifies object names, renames SQL database objects and parameters without breaking dependencies and much more.

To safely split a table using ApexSQL Refactor:

  1. Select the table in either SQL Server Management Studio or Visual Studio
  2. In the ApexSQL menu, click ApexSQL Refactor
  3. From the Other refactors sub-menu select the Split table command:

  4. In the window that will appear:
    1. Specify the name and the schema of the new table
    2. Select the columns to be copied or moved to the new table, including the primary key of the old table
    3. Determine the relationship between the tables by specifying the one which will hold the foreign key

  5. Click the Preview button to preview the change SQL script, which will be executed in order to create the new table and modify the old one
  6. Select the Dependencies tab to preview the dependent objects, which will be updated automatically to reflect the change
  7. Click the Create script button to open script in the Query editor
  8. Click the Execute button to split the table

Bottom line, gaining database performance by splitting large tables doesn’t have to be potentially hazardous to the database functionality or even time-consuming as ApexSQL Refactor can split the tables, without risking to break any dependencies, in a single click.

 

April 4, 2013