How to query data using a SELECT statement in SQL Server

The SELECT statement in SQL is the most used of the DML statements and it’s used for fetching data from tables in SQL Server. This article is intended for DBAs and developers who want to get familiar with all the major components of the SELECT statement and what this statement is all about.

Introduction to SELECT

The syntax of the SELECT statement can be rather complex, but let’s take a brief overview of some of the major keywords used in a SELECT statement, and then we will head over to SQL Server Management Studio to see how the main clauses can be summarized in code.

SELECT statement in SQL

So, SELECT just starts the statement and it’s probably followed by a star (*) AKA “splat”. It basically means retrieve all the columns from a table. If there are multiple tables that we are selecting from, the star will select all columns from all tables e.g. when joining two or more tables.

However, it’s advisable NOT to select everything from tables. “Why?” is a subject on its own. Let’s just say there are significantly more cons than pros that can be found online using simple research. Therefore, try not to select everything, and use specific column names followed by commas that will return only columns based on what’s in the list.

Furthermore, a lot of times when there is no column list, we have a keyword TOP followed by a number (n) after the SELECT statement in SQL that returns top (n) records from a table. This is usually used with the ORDER BY clause because, for example, if we want to grab top ten sales by quantity, those can be some big numbers. So, if we leave it at the default sort order, we’ll get little numbers first. However, if we say order it descending by the quantity that will give us the top ten records by the quantity. We also use DISTINCT in some cases after SELECT that gives us unique values across the select list.

FROM

The FROM part of the SELECT statement in SQL is simply used to tell SQL Server from which table should data be fetched. A JOIN is used when we want to pull data from multiple tables. There are three different types of joins:

  • Inner join – this is the default, used if no type is specified. When joining two tables on a common column, this type of join simply retrieves data that matches in both tables
  • Left join – means pull all data in the left-hand table and only the data that matches the left-hand table in the righthand table
  • Right join – you guessed it. It is the opposite of the left join

WHERE

The WHERE clause acts like a filter on a list of data that is coming back from tables. We can filter single or multiple columns out which will affect the data in the results set.

GROUP BY

The GROUP BY clause is associated with aggregates. If we want to do something like SUM, AVERAGE, MIN, MAX, etc. those are all aggregate functions, and GROUP BY allows us to arrange identical data into groups. In addition to this, we also have the HAVING clause which is pretty much a WHERE clause for groups. It allows us to apply filters to groups.

Examples

The above is a short overview of the SELECT statement in SQL. Let’s fire up SQL Server Management Studio and see how we can use some of that stuff. All of the following examples use the sample AdventureWorks2012 database.

This first example is as simple as it gets and returns all rows and columns using the (*) from the Product table:

SELECT *
FROM Production.Product;

The results set is populated with all the columns from the Product table. The list is rather long, there are 25 columns in total that can be seen if scrolled to the right:

Results set of a query populated with all the columns from a table

To select specific columns from the table, simply list column names followed by commas:

SELECT Product.ProductID
	,Product.Name
	,Product.ProductNumber
	,Product.MakeFlag
	,Product.FinishedGoodsFlag
FROM Production.Product;

This time, only a subset of the columns is returned:

Results set of a query populated with specific columns from a table

We could add a WHERE clause as shown below to select specific columns from a tabler with a filter:

SELECT Product.ProductID
	,Product.Name
	,Product.ProductNumber
	,Product.MakeFlag
	,Product.FinishedGoodsFlag
FROM Production.Product
WHERE Product.Name LIKE 'Mountain%';

Notice that we use a LIKE operator in the WHERE clause and therefore we have to specify wildcard character. In this example, the LIKE keyword says find everything that starts with “Mountain” and after that, it can be anything:

Results set of a query filtered with a WHERE clause

Using regular expressions can be more useful than specifying strings equality and allows the advanced search and text manipulation.

Let’s move on to querying data using joins. This allows us to combine data from two or more tables on common columns. Remember, SQL Server performs an INNER JOIN if only the JOIN keyword is specified.

The following query returns all product names and sales order IDs from two tables and joins them on the common ProductID column:

SELECT p.Name
	,sod.SalesOrderID
FROM Production.Product p
      JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;

Results set of a SELECT statement in SQL that joins two tables on the common column

We could also return all the data from both tables by simply using a star. It cannot be seen in the shot below, but the result set is populated with the data from the Product table first followed by the SalesOrderDetail table’s data:

SELECT *
FROM Production.Product p
      JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;

Results set of a SELECT statement in SQL that joins and return all the data from two tables

If we only want to return the data from the Product table, add a “p.” in front of the “splat” as shown below:

SELECT p.*
FROM Production.Product p
    JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;

Results set of a SELECT statement in SQL that joins and returns data only from one table

These little highlighted things in the query are called aliases. They are useful when we have long table names making the code more readable and easier to understand. Aliases are also required for databases with schema names such as this sample AdventureWorks2012 database:

Highlighted aliases in a query

For example, consider parsing the following query and see what happens:

SELECT ProductID
	,Name
FROM Production.Product p
    JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;

The result set says “Commands completed successfully.”, so it should run just fine, right?

Commands completed successfully as a result of a parse action

Well, not exactly. If we run the query, we’ll get an error saying “Ambiguous column name ‘ProductID’.” as shown below even though the syntax is correct:

Ambiguous column name error after running a query

This basically means two columns have the same ProductID column name. The SQL Server is confused as to which ProductID out of the two different tables we are referring to. If we add an alias to specify from which table the column should be fetched, the query will execute just fine:

SELECT p.ProductID ,-- or sod.ProductID
	Name
FROM Production.Product p
    JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;

Let’s move on and take a look at the SELECT statement in SQL using a left outer join. This type of join retrieves everything from the left-hand table and only the records that match in the right-hand table:

SELECT p.Name
	,sod.SalesOrderID
FROM Production.Product p
    LEFT JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID
ORDER BY p.Name;

In this example, it joins two tables on the ProductID column and retrieves every product, no matter if it has a sale or not, and preserves the unmatched rows from the left-hand table:

Results set of a SELECT statement in SQL with a left outer join

Do we really need an example for the SELECT statement in SQL using a right outer join? We mentioned earlier that it’s the exact opposite. But let’s look at an example anyway:

SELECT st.Name AS Territory
	,sp.BusinessEntityID
FROM Sales.SalesTerritory st
    RIGHT JOIN Sales.SalesPerson sp ON st.TerritoryID = sp.TerritoryID;

This time, it joins two tables on the TerritoryID column and retrieves everything from the right-hand table and only the records that match in the left-hand table. All salespersons appear in the result set, no matter if they are assigned territory or not:

Results set of a SELECT statement in SQL with a right outer join

Querying data using a third-party extension

It’s a known fact that nobody likes to write and read a ton of messy, unformatted text. When it comes to writing a complex SQL code that problem can become even more difficult. Querying data and writing SQL code smoothly requires years of practice and experience. Fortunately, there’s an easier way to query data using SELECT statement in SQL Server.

In this section, we’re going to take a look at a third-party tool that can help us with writing SQL code effortlessly. As a part of ApexSQL Fundamentals Toolkit for SQL Server, ApexSQL Complete extension for SSMS and VS is a well-known productivity tool that speeds up writing SQL code by automatically completing SQL statements, filling in SQL code fragments, and more.

Once the SQL code complete extension is installed into one or more hosts, it will suppress Microsoft’s native IntelliSense and take over the code completion. Enough talk, let’s open up a new query and see it in action. Start with a simple SELECT statement in SQL Server that you’re connected to. As soon as typing is started, the intelligent SQL code complete shows a hint-list with context-sensitive hints:

SQL code complete hint-list

The hint list provides objects (name, type, schema) based on the current content of the query. Doble-click or Enter the highlighted object and it will be inserted as shown below:

USE AdventureWorks2012
GO

SELECT *
FROM Production.Product p

Remember how aliases are important in some cases? Auto-complete generates them automatically, by default. If a long table name is selected from the hint list, it will generate an alias using the capital letters from the object’s name:

SELECT * FROM Production.ProductModelProductDescriptionCulture pmpdc

If you don’t need aliases, you can disable them by going to Options > Inserts and unchecking the “Auto-generate aliases” option:

Inserts tab of the Options window

If you need aliases though, here you can also configure the behavior when generating them as follows:

SELECT * FROM Production.Product pro -- Use first three letters for alias

SELECT * FROM Person.Person P -- Use upper case for alias

SELECT * FROM Person.Person AS p -- Use AS keyword with alias

Furthermore, you can manually configure global aliases under the Aliases tab of the Options window. Here, you’ll need to specify server, database, object, and alias as shown below:

Aliases tab of the Options window

Global aliases have precedence over automatically generated ones, so if we do another SELECT statement in SQL Server, we’ll get this:

SELECT * FROM Production.Product pp

Moving on, to select specific columns from the table, remove the star and the hint list will show all the available columns. From here, simply check only needed and they will be inserted:

Column picker from SQL code complete hint-list

Remember how we said earlier that the hint list is context-sensitive? This means that as far as the syntax goes, SQL code complete will list valid suggestions in its hint list. If we continue to type a WHERE clause in the example above, it will list appropriate column names, keywords, etc.:

SQL code complete hint-list with keywords

Auto-complete will not only fill in SQL keywords and names, but it will also help you write complex queries such as JOIN statements or cross-database queries:

SQL code complete hint-list with join suggestion on common columns

  • For detailed information about how to use the auto-complete feature, see Code completion

Conclusion

In this article, we’ve refreshed our memory on how to query data using SELECT statement in SQL Server. We started with a brief introduction and overview of all major components and then we jumped into SSMS and took a look at how to write everything from a basic SELECT statement in SQL to all the different types of joins.

Then we looked at an extension from ApexSQL Fundamentals Toolkit for SQL Server, ApexSQL Complete that reduces typing by automatically completing SQL statements for us. Furthermore, it automatically inserts fully qualified object names and alias on the server or database level. This tool provides an array of options that make writing SELECT statements in SQL a piece of cake.

 

January 4, 2021