SQL Formatting standards – joins, lists, structure, operations

The first part of the series – SQL Formatting standards – Capitalization, Indentation, Comments, Parenthesis, explains the importance of having clean SQL code. In short, deciphering someone else’s code is time-consuming. Clean and neat SQL code can be read faster; SQL reviewing and troubleshooting is more efficient; joint development efforts are more effective; handing off projects from one team to another is smoother than for inconsistently written SQL.

As there are neither style nor standards to format SQL, it’s up to the team to create its own set of formatting standards. Here are some recommendations to format joins, value lists, code structure, arithmetic, comparison and logical operations.

Joins

As SQL joins add more complexity, readable code here is essential. This complexity is brought on by the number of parameters that can be formatted in each JOIN: the JOIN keyword, the ON keyword, the first table in the JOIN statement, a joined object. As the number of joined tables increases, so does the importance to properly format SQL.

The most common dilemmas for developers are: should the JOIN keywords be placed on new lines, should the ON keywords be placed on new lines, should each clause in a JOIN be placed on a new line, how much to indent the ON keyword if placed on a new line?

SQL Formatting options - Joins

It’s all up to the team to decide. The following examples can help in making a decision:

Style 1 – the JOIN keyword is placed on a new line. The ON keyword is also on a new line, indented 0 spaces/aligned with the JOIN keyword:

SELECT e.BusinessEntityID , 
       p.Title , 
       pp.PhoneNumber , 
       pnt.Name AS PhoneNumberType , 
       ea.EmailAddress , 
       a.AddressLine1 , 
       sp.Name AS StateProvinceName , 
       cr.Name AS CountryRegionName
  FROM
       HumanResources.Employee e
       INNER JOIN Person.Person p
       ON p.BusinessEntityID
          = 
          e.BusinessEntityID
       INNER JOIN Person.BusinessEntityAddress bea
       ON bea.BusinessEntityID
          = 
          e.BusinessEntityID
       INNER JOIN Person.Address a
       ON a.AddressID
          = 
          bea.AddressID
       INNER JOIN Person.StateProvince sp
       ON sp.StateProvinceID
          = 
          a.StateProvinceID
       INNER JOIN Person.CountryRegion cr
       ON cr.CountryRegionCode
          = 
          sp.CountryRegionCode
       LEFT OUTER JOIN Person.PersonPhone pp
       ON pp.BusinessEntityID
          = 
          p.BusinessEntityID
       LEFT OUTER JOIN Person.PhoneNumberType pnt
       ON pp.PhoneNumberTypeID
          = 
          pnt.PhoneNumberTypeID
       LEFT OUTER JOIN Person.EmailAddress ea
       ON p.BusinessEntityID
          = 
          ea.BusinessEntityID;

Style 2 – the JOIN keyword is placed on a new line. The ON keyword is not moved to a new line:

SELECT e.BusinessEntityID , 
       p.Title , 
       pp.PhoneNumber , 
       pnt.Name AS PhoneNumberType , 
       ea.EmailAddress , 
       a.AddressLine1 , 
       sp.Name AS StateProvinceName , 
       cr.Name AS CountryRegionName
  FROM
       HumanResources.Employee e
       INNER JOIN Person.Person p ON p.BusinessEntityID
                                     = 
                                     e.BusinessEntityID
       INNER JOIN Person.BusinessEntityAddress bea ON bea.BusinessEntityID
                                                      = 
                                                      e.BusinessEntityID
       INNER JOIN Person.Address a ON a.AddressID
                                      = 
                                      bea.AddressID
       INNER JOIN Person.StateProvince sp ON sp.StateProvinceID
                                             = 
                                             a.StateProvinceID
       INNER JOIN Person.CountryRegion cr ON cr.CountryRegionCode
                                             = 
                                             sp.CountryRegionCode
       LEFT OUTER JOIN Person.PersonPhone pp ON pp.BusinessEntityID
                                                = 
                                                p.BusinessEntityID
       LEFT OUTER JOIN Person.PhoneNumberType pnt ON pp.PhoneNumberTypeID
                                                     = 
                                                     pnt.PhoneNumberTypeID
       LEFT OUTER JOIN Person.EmailAddress ea ON p.BusinessEntityID
                                                 = 
                                                 ea.BusinessEntityID;

Style 3 – the JOIN keyword is not on a new line, the ON keyword is on a new line aligned with the JOIN keyword:

SELECT e.BusinessEntityID , 
       p.Title , 
       pp.PhoneNumber , 
       pnt.Name AS PhoneNumberType , 
       ea.EmailAddress , 
       a.AddressLine1 , 
       sp.Name AS StateProvinceName , 
       cr.Name AS CountryRegionName
  FROM
       HumanResources.Employee e INNER JOIN Person.Person p
                                 ON p.BusinessEntityID
                                    = 
                                    e.BusinessEntityID
                                 INNER JOIN Person.BusinessEntityAddress bea
                                 ON bea.BusinessEntityID
                                    = 
                                    e.BusinessEntityID
                                 INNER JOIN Person.Address a
                                 ON a.AddressID
                                    = 
                                    bea.AddressID
                                 INNER JOIN Person.StateProvince sp
                                 ON sp.StateProvinceID
                                    = 
                                    a.StateProvinceID
                                 INNER JOIN Person.CountryRegion cr
                                 ON cr.CountryRegionCode
                                    = 
                                    sp.CountryRegionCode
                                 LEFT OUTER JOIN Person.PersonPhone pp
                                 ON pp.BusinessEntityID
                                    = 
                                    p.BusinessEntityID
                                 LEFT OUTER JOIN Person.PhoneNumberType pnt
                                 ON pp.PhoneNumberTypeID
                                    = 
                                    pnt.PhoneNumberTypeID
                                 LEFT OUTER JOIN Person.EmailAddress ea
                                 ON p.BusinessEntityID
                                    = 
                                    ea.BusinessEntityID;

Value lists

A table value constructor is used to specify a set of row value expressions that will be inserted into a table, using a single DML statement. It can be used in the VALUES clause of the INSERT INTO statement, the USING < source table > clause of the MERGE statement, and in the definition of a derived table in the FROM clause.

The value lists in a table value constructor, in any of the three cases above, can be formatted using the same rules set in the Lists tab under Values sub-tab.

The parameters that can be formatted are: the VALUES keyword, entire rows and row values:

SQL Formatting options - Value lists

The formatting set in the Values sub-tab is not applied to the table column list in the INSERT INTO clause. It is formatted using the rules set in the Lists tab under Columns sub-tab, of the Column lists section:

SQL Formatting options - Data statements tab

Parenthesis formatting options set in the Lists tab under Values sub-tab are applied to the value list parenthesis

Style 1 – the value list is placed on a new line, but all values are on the same line:

INSERT INTO Production.UnitMeasure(
           	    Name , 
            	    UnitMeasureCode , 
            	    ModifiedDate )
VALUES
               (N'FT2', N'Square Feet ', '20080923'), (N'Y', N'Yards', '20080923'), (N'Y3', 'Cubic Yards', '20080923');
GO

Style 2 – each row value is on a new line, with a comma placed at the end of each line:

INSERT INTO Production.UnitMeasure(
           	   Name , 
            	   UnitMeasureCode , 
            	   ModifiedDate )
VALUES
              ( N'FT2' , N'Square Feet ' , '20080923' ) , 
              ( N'Y' , N'Yards' , '20080923' ) , 
              ( N'Y3' , 'Cubic Yards' , '20080923' );
GO

Style 3 – each value is placed on a new line:

INSERT INTO Production.UnitMeasure(
            Name , 
            UnitMeasureCode , 
            ModifiedDate )
VALUES
       ( N'FT2'
       , N'Square Feet '
       , '20080923' ) , 
       ( N'Y'
       , N'Yards'
       , '20080923' ) , 
       ( N'Y3'
       , 'Cubic Yards'
       , '20080923' );
GO

Structure

Breaking SQL across lines, spacing and aligning are the biggest challenges in SQL formatting. They significantly contribute to SQL readability, so make sure that the SQL code formatting standards are properly set for them

Here are some basic guidelines that can be used when setting the SQL formatting standards for the SELECT, FROM, JOIN, WHERE, GROUP BY and ORDER BY clauses:

  • Place each keyword on a new line
  • Place arguments on a new line
  • Indented the lines per the precedent preceding keyword
  • Place all nested SELECT statements in a new line
  • Align a nested SELECT statement with a preceding keyword, or intent them 4 spaces
  • Place the FROM keyword either in a new line, or in the same line with selected columns
  • Wrap lines longer than 80 characters
  • Wrap SELECT statements longer than 30 characters

A drawback of this approach is the excessive white space and too much scrolling through SQL code

SQL Formatting options - Data statements

Style 1 – all column list items are on the same line, the FROM keyword is indented by 2 spaces, the nested SELECT statement is placed on a new line aligned with the preceding keyword:

SELECT Col_name , data_type , audit_log_data_id
  FROM dbo.AUDIT_LOG_DATA
  WHERE AUDIT_LOG_TRANSACTION_ID IN( 
        SELECT AUDIT_LOG_TRANSACTION_ID
          FROM dbo.AUDIT_LOG_TRANSACTIONS
          WHERE TABLE_NAME LIKE 'ProductPhoto' );

Style 2 – all column list items are on the same line, the FROM keyword is indented by 2 spaces, the nested SELECT statement is placed on a new line after the last character on the preceding line:

SELECT Col_name , data_type , audit_log_data_id
  FROM dbo.AUDIT_LOG_DATA
  WHERE AUDIT_LOG_TRANSACTION_ID IN( 
                                     SELECT AUDIT_LOG_TRANSACTION_ID
                                       FROM dbo.AUDIT_LOG_TRANSACTIONS
                                       WHERE TABLE_NAME LIKE 'ProductPhoto' );

Style 3 – the FROM and WHERE keywords are not moved to a new line, the SELECT statement is placed on a new line and indented by 4 spaces:

SELECT Col_name , data_type , audit_log_data_id
  FROM dbo.AUDIT_LOG_DATA WHERE AUDIT_LOG_TRANSACTION_ID IN(
    SELECT AUDIT_LOG_TRANSACTION_ID
      FROM dbo.AUDIT_LOG_TRANSACTIONS WHERE TABLE_NAME LIKE 'ProductPhoto' );

Arithmetic, comparison and logical operations

SQL readability can also be improved if arithmetic, comparison and logical operations are formatted. Be careful with line breaking for operations and spreading them across multiple lines, as it can turn 2 lines of SQL code into 20 lines, as shown in one of the examples below:

SQL Formatting options - Expressions

Style 1 – no operations are moved to new lines, nor spread across multiple lines:

SELECT employee , oldsalary , newsalary
  FROM salary
  WHERE oldsalary <= AVG( oldsalary ) AND oldsalary + @COLI * oldsalary < 
AVG( NewSalary ) OR @COLI + 10 - (AVG( oldsalary ) + 125) / 5 < 1000;

Style 2 – all operations are moved to new lines, and spread across multiple lines:

SELECT employee , oldsalary , newsalary
  FROM salary
  WHERE
       oldsalary
       <= 
       AVG( oldsalary )
   AND
       oldsalary
     + @COLI
     * oldsalary
       < 
       AVG( NewSalary )
    OR
       @COLI
     + 10
     - (
       AVG( oldsalary )
     + 125)
     / 5
       < 
       1000;

Style 3 – all operations, except arithmetic, are moved to new lines, and spread across multiple lines:

SELECT employee , oldsalary , newsalary
  FROM salary
  WHERE
       oldsalary
       <= 
       AVG( oldsalary )
   AND
       oldsalary + @COLI * oldsalary
       < 
       AVG( NewSalary )
    OR
       @COLI + 10 - (AVG( oldsalary ) + 125) / 5
       < 
       1000;

Once the rules are set, finding an easy way to enforce them comes next. Applying all the rules manually is a time-consuming process where mistakes are very likely to happen

This is where our SQL code formatter can help.

ApexSQL Refactor is a SQL Server Management Studio and Visual Studio add-in, which formats and refactors SQL utilizing over 160 formatting options. It can be used to distribute and enforce SQL formatting standards and rules among team members.

To set the formatting rules for SQL:

  1. In SQL Server Management Studio or Visual Studio, click ApexSQL Refactor from the ApexSQL menu
  2. Select the Options command, from the ApexSQL Refactor menu:

    ApexSQL Refactor SSMS menu

  3. Set the specific option as described in this article
  4. Preview the option’s effect on the current query or a built-in example:

    ApexSQL Refactor formatting options

  5. In SQL Server Management Studio or Visual Studio, go to ApexSQL menu, select ApexSQL Refactor, select the Format SQL option, and choose one of the formatting profiles to apply the formatting to SQL in the current Query Editor tab

Distributing a SQL formatting ApexSQL Refactor profile to all team members is also easy:

  1. In the Options window, click the Export button to save the formatting rules to an XML file:

    ApexSQL Refactor general formatting options

  2. Specify the file name and location
  3. Copy the XML file to a team mate’s machine
  4. In the Options window, click the Import button
  5. Navigate to the XML file

Clear guidelines and automatic implementation of formatting rules ensure that all team members, and anyone who inherits the code, can read it easily. Code wisely and don’t waste time deciphering messy SQL. Use ApexSQL Refactor to automatically implement all SQL formatting standards.

April 4, 2013