How to format SQL like a pro – formatting to implicit Microsoft standards and guidance

There is little formal guidance regarding SQL formatting and coding styles, but there is no universally accepted coding standard for SQL Server. In this article, though, implicit guidance will be followed from:

  • MSDN
  • SQL Server 2012 Books Online documentation
  • and the Adventure Works 2012 SQL Server database

This article will describehow to implement these standards via ApexSQL Refactor.

ApexSQL Refactor is a SQL Server Management Studio and Visual Studio add-in and SQL formatter with nearly 200 formatting options.

Use statement terminator

Standard: SQL Server 2012 Books Online documentation

SQL Server requires the semicolon only in particular cases:

  1. To terminate the statement prior to a WITH clause defining a Common Table Expression (CTE)
  2. To terminate the statement prior to a SEND or RECEIVE service broker statement
  3. To terminate the MERGE statement

For example, not using semicolon to terminate the MERGE statement will result with an error:

Msg 10713, Level 15, State 1, Line 17
A MERGE statement must be terminated by a semi-colon (;).

Books Online for SQL Server 2012 documentation indicates that not terminating T-SQL statements with a semicolon is a deprecated feature, and that it will not be supported in a future version of SQL Server.

ApexSQL Refactor has an option for SQL coding style, which adds the semicolon at the end of each statement. Under the ApexSQL Refactor menu, in the Options window under the Formatting tab, check the Always use statement terminator option:

This option will terminate all statements with a semicolon. For example the following code:

CREATE FUNCTION [dbo].[ufnGetDocumentStatusText](@Status [tinyint])
RETURNS [nvarchar](16) 
AS 

BEGIN
    DECLARE @ret [nvarchar](16);

    SET @ret = 
        CASE @Status
            WHEN 1 THEN N'Pending approval'
            WHEN 2 THEN N'Approved'
            WHEN 3 THEN N'Obsolete'
            ELSE N'** Invalid **'
        END
    
    RETURN @ret
END

Will be formatted with the statement termination:

CREATE FUNCTION [dbo].[ufnGetDocumentStatusText](
                           @Status [tinyint])
RETURNS [nvarchar](16)
AS
BEGIN
       DECLARE @ret [nvarchar](16);

       SET @ret = CASE @Status
		        WHEN 1 THEN N'Pending approval'
			WHEN 2 THEN N'Approved'
			WHEN 3 THEN N'Obsolete'
			ELSE N'** Invalid **'
			END;

       RETURN @ret;
END;

Indentation

Standard: SQL Server 2012 Books Online documentation and AdventureWorks2012 database

Here, 4 spaces for indenting following the AdventureWorks2012 database indenting will be used.

There are three different styles of SQL code indenting in SSMS, and it can also be specified how many spaces compose a single indentation or tab.

In SSMS, under the Tools menu, click the Options command. In the Text Editor drop-down list choose Transact-SQL and Tabs:

When the None option is selected, the new line is not indented on ENTER.

When the Block option is selected, the new line created on ENTER is automatically indented the same distance as the previous line. The Smart option is unavailable for T-SQL.

The Tab size option sets the distance in spaces between tab stops. The Indent size option sets the size in spaces of an automatic indentation.

The Insert spaces option indent operations inserting only space characters. If Indent size is set to 4, then four space characters are inserted whenever the TAB key is pressed or the Increase Indent button is clicked on the toolbar in the main SSMS window.

When the Keep tabs option is selected, indent operation inserts for each tab character the number of spaces specified in Tab size.

In ApexSQL Refactor, there are several SQL coding style options for indentation. In the Options window, under the Formatting tab options set the general indentation rule. The Indent using tabs option equal to 4 spaces will be set:

Joins

Standard: AdventureWorks2012 database

A JOIN operator operates on two input tables. There are three fundamental types of joins – cross joins, inner joins, and outer joins. Each type applies a different set of phases.

A CROSS JOIN applies only one phase — Cartesian Product. A Cartesian product is a mathematical operation which returns a product set from multiple sets. For sets A and B, the Cartesian product A × B is the set of all ordered pairs (a, b) where a ∈ A and b ∈ B. That is, each row from one input is matched with all rows from the other. A table can be created by taking the Cartesian product of a set of rows and a set of columns

An INNER JOIN applies two logical query processing phases— a Cartesian product between the two input tables as in a cross join, and then it filters rows based on a specified predicate in the ON clause.

An OUTER JOIN applies three phases – the two logical processing phases that inner joins apply (Cartesian product and the ON filter), a phase Adding Outer Rows. Third phase identifies the rows from the preserved table that did not find matches in the other table based on the ON predicate and adds those rows to the result table produced by the first two phases of the join, and uses NULL marks for the attributes from the non-preserved side of the join.

To achieve formatting style following the Adventure Works 2012 SQL database, in the Options window under the Joins tab, the following options will be chosen:

This will format the following SQL code:

FROM [BOM_cte] cte INNER JOIN [Production].[BillOfMaterials] b 
ON b.[ProductAssemblyID]
= cte.[ComponentID]
      INNER JOIN [Production].[Product] p 
 ON b.[ComponentID] = p.[ProductID]

In SQL code just as it is in AdventureWorks2012 SQL database:

FROM [BOM_cte] AS cte
    INNER JOIN [Production].[BillOfMaterials] AS b
    ON b.[ProductAssemblyID] = cte.[ComponentID]
    INNER JOIN [Production].[Product] AS p
    ON b.[ComponentID] = p.[ProductID]

ApexSQL Refactor also has an option to format the FROM clause. In the Options window, under the Statements tab choose the Move FROM clause to new line option and indent 0 spaces:

Setting this option will additionally help achieving AdventureWorks2012 database formatting and the following SQL statement:

CREATE PROCEDURE [dbo].[uspGetAddressInfo] @City nvarchar(30)
AS
BEGIN
SELECT * FROM AdventureWorks2012.Person.Address
WHERE City = @City
END
GO

Will be formatted by AdventureWorks2012 standard:

Will be formatted by AdventureWorks2012 standard:
CREATE PROCEDURE [dbo].[uspGetAddressInfo]
	  @City nvarchar(30)
AS
BEGIN
       SELECT
	      *
       FROM AdventureWorks2012.Person.Address
       WHERE City = @City;
END;

The same formatting for the WHERE clause, and format the nested SELECT statements will be choosen.

Column lists in AdventureWorks2012 SQL database are formatted by placing a column list on a new line, and placing a comma before column names to facilitate code commenting. In the Options window, under the Lists tab and under the Columns sub-tab, the formatting rules for the Column lists option which will format the following SQL statement will be set:

CREATE VIEW [Purchasing].[vVendorWithContacts] AS 
SELECT  v.[BusinessEntityID],v.[Name],ct.[Name] AS [ContactType]  ,p.[Title] 
    ,p.[FirstName] 
    ,p.[MiddleName] 
    ,p.[LastName] ,p.[Suffix] 
    ,pp.[PhoneNumber],pnt.[Name] AS [PhoneNumberType]
    ,ea.[EmailAddress] ,p.[EmailPromotion] 

In accordance with the AdventureWorks2012 SQL database standard:

CREATE VIEW [Purchasing].[vVendorWithContacts] AS 
SELECT 
    v.[BusinessEntityID]
    ,v.[Name]
    ,ct.[Name] AS [ContactType] 
    ,p.[Title] 
    ,p.[FirstName] 
    ,p.[MiddleName] 
    ,p.[LastName] 
    ,p.[Suffix] 
    ,pp.[PhoneNumber] 
    ,pnt.[Name] AS [PhoneNumberType]
    ,ea.[EmailAddress] 
    ,p.[EmailPromotion] 

Aliases

Standard: MSDN site and AdventureWorks2012 database

The readability of a SELECT statement can be improved by giving a table an alias. A table alias can be assigned with or without the AS keyword. In the FROM statement an alias can be used to distinguish a table or a view in a self-join or sub query and usually is a shortened table name. If the same column name exists in more than one table in a JOIN statement, SQL Server requires the column name to be qualified by a table name, view name, or alias. If an alias is defined the table name cannot be used.

For example, in the following SQL code alias for the table source of the StateProvince column is sp:

CREATE VIEW [dbo].[vw_NewYork]
AS
SELECT p.LastName, p.FirstName, e.JobTitle, a.City, sp.StateProvinceCode
FROM HumanResources.Employee e
       INNER JOIN Person.Person p
       ON p.BusinessEntityID = e.BusinessEntityID
    INNER JOIN Person.BusinessEntityAddress bea 
    ON bea.BusinessEntityID = e.BusinessEntityID 
    INNER JOIN Person.Address a 
    ON a.AddressID = bea.AddressID
    INNER JOIN Person.StateProvince sp 
    ON sp.StateProvinceID = a.StateProvinceID
WHERE a.City = 'Seattle'

If, instead of an alias, we try to execute SQL code using the actual table name Person:

CREATE VIEW [dbo].[vw_Seattle]
AS
SELECT p.LastName, p.FirstName, e.JobTitle, a.City, sp.StateProvinceCode
FROM HumanResources.Employee e
       INNER JOIN Person.Person p
       ON p.BusinessEntityID = e.BusinessEntityID
    INNER JOIN Person.BusinessEntityAddress bea 
    ON bea.BusinessEntityID = e.BusinessEntityID 
    INNER JOIN Person.Address a 
    ON a.AddressID = bea.AddressID
    INNER JOIN Person.StateProvince sp 
    ON Person.StateProvinceID = a.StateProvinceID
WHERE a.City = 'Seattle'

The execution of the SQL query will result with an error:

Msg 4104, Level 16, State 1, Procedure vw_Seattle, Line 12
The multi-part identifier “Person.StateProvinceID” could not be bound.

In ApexSQL Refactor, the Alias option can be set under the Statements tab:

And the SQL statement:

CREATE VIEW [Purchasing].[vVendorWithContacts]  
SELECT  v.[BusinessEntityID],v.[Name],ct.[Name]  [ContactType]  ,p.[Title] 
    ,p.[FirstName] 
    ,p.[MiddleName] 
    ,p.[LastName] ,p.[Suffix] 
    ,pp.[PhoneNumber],pnt.[Name]  [PhoneNumberType]
    ,ea.[EmailAddress] ,p.[EmailPromotion] 

Will be formatted by AdventureWorks2012 standard:

CREATE VIEW [Purchasing].[vVendorWithContacts]  
SELECT
       v.[BusinessEntityID]
  , v.[Name]
  , ct.[Name] AS [ContactType]
  , p.[Title]
  , p.[FirstName]
  , p.[MiddleName]
  , p.[LastName]
  , p.[Suffix]
  , pp.[PhoneNumber]
  , pnt.[Name] AS [PhoneNumberType]
  , ea.[EmailAddress]
  , p.[EmailPromotion]

Capitalization

Standard: MSDN site

According to MSDN Transact-SQL syntax conventions, UPPERCASE should be used for SQL keywords and built-in functions. Types should be lowercase:

The formatting of identifiers and variables will be leave as is, but note that it is not a good practice to use reserved keywords as identifiers, and that consistency in SQL objects naming should be preserved using either Proper case or Camel case and always using regular identifiers, for example, ContactType instead of Contact Type.

Schema Statements

Standard: AdventureWorks2012 database

Following AdventureWorks2012 SQL database formatting standard in creating stored procedures, in the Options window, under the Lists tab and under the Parameters sub-tab, format the Parameters option:

Using this option, the AdventureWorks2012 database formatting style for stored procedure parameters from this SQL code will be achieved:

CREATE PROCEDURE [dbo].[uspGetWhereUsedProductID] @StartProductID [int],
@CheckDate
[datetime]
AS
BEGIN

To properly formatted SQL statement:

CREATE PROCEDURE [dbo].[uspGetWhereUsedProductID]
       @StartProductID [int],
       @CheckDate [datetime]
AS
BEGIN

Expressions

Standard: MSDN site and AdventureWorks2012 database

ApexSQL Refactor offers SQL formatting options for formatting logical, comparison and arithmetic operations.

Logical operators – test for the truth of some condition and return a Boolean data type with a value of TRUE, FALSE, or UNKNOWN.

Arithmetic operators -used to perform mathematical operations on multiple expressions and can be used with numeric data type.

Comparison operators – test whether two expressions are the same and can be used on all expressions except the text, ntext, or image data types.

In the Expressions tab under the Logical sub-tab in the Options window, set the Logical operations option under the Show operations on multiple lines group, following the formatting of the Adventure Works 2012 database:

This option will format the following SQL code:

FROM [Production].[BillOfMaterials] b
            INNER JOIN [Production].[Product] p 
            ON b.[ProductAssemblyID] = p.[ProductID] 
        WHERE b.[ComponentID] = @StartProductID 
AND @CheckDate >= b.[StartDate] AND 
@CheckDate <= ISNULL(b.[EndDate], @CheckDate)
        UNION ALL

According to AdventureWorks2012 SQL database standard:

FROM [Production].[BillOfMaterials] b
     INNER JOIN [Production].[Product] p 
     ON b.[ProductAssemblyID] = p.[ProductID] 
WHERE b.[ComponentID] = @StartProductID 
    AND @CheckDate >= b.[StartDate] 
    AND @CheckDate <= ISNULL(b.[EndDate], @CheckDate)
UNION ALL

Flow control

Standard: MSDN site

To indicate that a statement covers more than one line using the BEGIN/END combination makes code easier to read because it clearly indicates the beginning and end of the statement. Generally, the BEGIN/END statements are used in IF/ELSE loops and in the WHILE loops, but they are also used in stored procedures and IF statements by MSDN standard.

To enforce wrapping up of the stored procedures with the BEGIN/END statements combination in the Options window, under the Flow control tab select the Always use BEGIN and END in the IF statements and the Always use BEGIN and END in stored procedures options:

This way all stored procedures and IF statements will be formatted with the BEGIN/END statement, and writing the CREATE PROCEDURE statement:

CREATE PROCEDURE [dbo].[uspGetAddressInfo] @City nvarchar(30)
AS
SELECT * FROM AdventureWorks2012.Person.Address
WHERE City = @City

GO

Will wrap up the statement with the BEGIN/END statement:

CREATE PROCEDURE [dbo].[uspGetAddressInfo]
	  @City nvarchar(30)
AS
BEGIN
       SELECT
	      *
       FROM AdventureWorks2012.Person.Address
       WHERE City = @City;
END;

GO

This will round up the formatting options. To test formatting settings the following unformatted code will be formated:

FROM Purchasing.Vendor v
INNER JOIN Person.BusinessEntityContact bec ON bec.BusinessEntityID 
= v.BusinessEntityID
INNER JOIN Person.ContactType ct ON ct.ContactTypeID = bec.ContactTypeID 
INNER JOIN Person.Person p
ON p.BusinessEntityID = bec.PersonID LEFT OUTER JOIN Person.EmailAddress ea
	     ON ea.BusinessEntityID = p.BusinessEntityID
	     LEFT OUTER JOIN Person.PersonPhone pp
	     ON pp.BusinessEntityID = p.BusinessEntityID
	     LEFT OUTER JOIN Person.PhoneNumberType pnt ON pnt.PhoneNumberTypeID 
= pp.PhoneNumberTypeID;

The formatting settings will give s the following coding style:

FROM Purchasing.Vendor AS v
       INNER JOIN Person.BusinessEntityContact AS bec
       ON bec.BusinessEntityID = v.BusinessEntityID
       INNER JOIN Person.ContactType AS ct
       ON ct.ContactTypeID = bec.ContactTypeID
       INNER JOIN Person.Person AS p
       ON p.BusinessEntityID = bec.PersonID
       LEFT OUTER JOIN Person.EmailAddress AS ea
       ON ea.BusinessEntityID = p.BusinessEntityID
       LEFT OUTER JOIN Person.PersonPhone AS pp
       ON pp.BusinessEntityID = p.BusinessEntityID
       LEFT OUTER JOIN Person.PhoneNumberType AS pnt
       ON pnt.PhoneNumberTypeID = pp.PhoneNumberTypeID;

Complete formatting settings are given in the XML code, which may be downloaded here

To use these settings simply unzip the downloaded XML code.

In the Options window of ApexSQL Refactor, click the Import button and import SQL coding style template to ApexSQL Refactor. In the Profile drop-down menu choose the new template and click the Set as active button:

 

January 8, 2014