The SELECT statement in SQL is the most used of the DML statements and it’s used for fetching data from tables in SQL Server. This article is intended for DBAs and developers who want to get familiar with all the major components of the SELECT statement and what this statement is all about.
Introduction to SELECT
The syntax of the SELECT statement can be rather complex, but let’s take a brief overview of some of the major keywords used in a SELECT statement, and then we will head over to SQL Server Management Studio to see how the main clauses can be summarized in code.
SELECT statement in SQL
So, SELECT just starts the statement and it’s probably followed by a star (*) AKA “splat”. It basically means retrieve all the columns from a table. If there are multiple tables that we are selecting from, the star will select all columns from all tables e.g. when joining two or more tables.
However, it’s advisable NOT to select everything from tables. “Why?” is a subject on its own. Let’s just say there are significantly more cons than pros that can be found online using simple research. Therefore, try not to select everything, and use specific column names followed by commas that will return only columns based on what’s in the list.
Furthermore, a lot of times when there is no column list, we have a keyword TOP followed by a number (n) after the SELECT statement in SQL that returns top (n) records from a table. This is usually used with the ORDER BY clause because, for example, if we want to grab top ten sales by quantity, those can be some big numbers. So, if we leave it at the default sort order, we’ll get little numbers first. However, if we say order it descending by the quantity that will give us the top ten records by the quantity. We also use DISTINCT in some cases after SELECT that gives us unique values across the select list.
FROM
The FROM part of the SELECT statement in SQL is simply used to tell SQL Server from which table should data be fetched. A JOIN is used when we want to pull data from multiple tables. There are three different types of joins:
- Inner join – this is the default, used if no type is specified. When joining two tables on a common column, this type of join simply retrieves data that matches in both tables
- Left join – means pull all data in the left-hand table and only the data that matches the left-hand table in the righthand table
- Right join – you guessed it. It is the opposite of the left join
WHERE
The WHERE clause acts like a filter on a list of data that is coming back from tables. We can filter single or multiple columns out which will affect the data in the results set.
GROUP BY
The GROUP BY clause is associated with aggregates. If we want to do something like SUM, AVERAGE, MIN, MAX, etc. those are all aggregate functions, and GROUP BY allows us to arrange identical data into groups. In addition to this, we also have the HAVING clause which is pretty much a WHERE clause for groups. It allows us to apply filters to groups.
Examples
The above is a short overview of the SELECT statement in SQL. Let’s fire up SQL Server Management Studio and see how we can use some of that stuff. All of the following examples use the sample AdventureWorks2012 database.
This first example is as simple as it gets and returns all rows and columns using the (*) from the Product table:
SELECT * FROM Production.Product;
The results set is populated with all the columns from the Product table. The list is rather long, there are 25 columns in total that can be seen if scrolled to the right:
To select specific columns from the table, simply list column names followed by commas:
SELECT Product.ProductID ,Product.Name ,Product.ProductNumber ,Product.MakeFlag ,Product.FinishedGoodsFlag FROM Production.Product;
This time, only a subset of the columns is returned:
We could add a WHERE clause as shown below to select specific columns from a tabler with a filter:
SELECT Product.ProductID ,Product.Name ,Product.ProductNumber ,Product.MakeFlag ,Product.FinishedGoodsFlag FROM Production.Product WHERE Product.Name LIKE 'Mountain%';
Notice that we use a LIKE operator in the WHERE clause and therefore we have to specify wildcard character. In this example, the LIKE keyword says find everything that starts with “Mountain” and after that, it can be anything:
Using regular expressions can be more useful than specifying strings equality and allows the advanced search and text manipulation.
- For detailed information about how to use the LIKE operator and wildcard characters, see Overview of the SQL LIKE Operator
Let’s move on to querying data using joins. This allows us to combine data from two or more tables on common columns. Remember, SQL Server performs an INNER JOIN if only the JOIN keyword is specified.
The following query returns all product names and sales order IDs from two tables and joins them on the common ProductID column:
SELECT p.Name ,sod.SalesOrderID FROM Production.Product p JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;
We could also return all the data from both tables by simply using a star. It cannot be seen in the shot below, but the result set is populated with the data from the Product table first followed by the SalesOrderDetail table’s data:
SELECT * FROM Production.Product p JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;
If we only want to return the data from the Product table, add a “p.” in front of the “splat” as shown below:
SELECT p.* FROM Production.Product p JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;
These little highlighted things in the query are called aliases. They are useful when we have long table names making the code more readable and easier to understand. Aliases are also required for databases with schema names such as this sample AdventureWorks2012 database:
For example, consider parsing the following query and see what happens:
SELECT ProductID ,Name FROM Production.Product p JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;
The result set says “Commands completed successfully.”, so it should run just fine, right?
Well, not exactly. If we run the query, we’ll get an error saying “Ambiguous column name ‘ProductID’.” as shown below even though the syntax is correct:
This basically means two columns have the same ProductID column name. The SQL Server is confused as to which ProductID out of the two different tables we are referring to. If we add an alias to specify from which table the column should be fetched, the query will execute just fine:
SELECT p.ProductID ,-- or sod.ProductID Name FROM Production.Product p JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID;
Let’s move on and take a look at the SELECT statement in SQL using a left outer join. This type of join retrieves everything from the left-hand table and only the records that match in the right-hand table:
SELECT p.Name ,sod.SalesOrderID FROM Production.Product p LEFT JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID ORDER BY p.Name;
In this example, it joins two tables on the ProductID column and retrieves every product, no matter if it has a sale or not, and preserves the unmatched rows from the left-hand table:
Do we really need an example for the SELECT statement in SQL using a right outer join? We mentioned earlier that it’s the exact opposite. But let’s look at an example anyway:
SELECT st.Name AS Territory ,sp.BusinessEntityID FROM Sales.SalesTerritory st RIGHT JOIN Sales.SalesPerson sp ON st.TerritoryID = sp.TerritoryID;
This time, it joins two tables on the TerritoryID column and retrieves everything from the right-hand table and only the records that match in the left-hand table. All salespersons appear in the result set, no matter if they are assigned territory or not:
- For detailed information about the different types of joins, see SQL Join clause introduction and overview
Querying data using a third-party extension
It’s a known fact that nobody likes to write and read a ton of messy, unformatted text. When it comes to writing a complex SQL code that problem can become even more difficult. Querying data and writing SQL code smoothly requires years of practice and experience. Fortunately, there’s an easier way to query data using SELECT statement in SQL Server.
In this section, we’re going to take a look at a third-party tool that can help us with writing SQL code effortlessly. As a part of ApexSQL Fundamentals Toolkit for SQL Server, ApexSQL Complete extension for SSMS and VS is a well-known productivity tool that speeds up writing SQL code by automatically completing SQL statements, filling in SQL code fragments, and more.
Once the SQL code complete extension is installed into one or more hosts, it will suppress Microsoft’s native IntelliSense and take over the code completion. Enough talk, let’s open up a new query and see it in action. Start with a simple SELECT statement in SQL Server that you’re connected to. As soon as typing is started, the intelligent SQL code complete shows a hint-list with context-sensitive hints:
The hint list provides objects (name, type, schema) based on the current content of the query. Doble-click or Enter the highlighted object and it will be inserted as shown below:
USE AdventureWorks2012 GO SELECT * FROM Production.Product p
Remember how aliases are important in some cases? Auto-complete generates them automatically, by default. If a long table name is selected from the hint list, it will generate an alias using the capital letters from the object’s name:
SELECT * FROM Production.ProductModelProductDescriptionCulture pmpdc
If you don’t need aliases, you can disable them by going to Options > Inserts and unchecking the “Auto-generate aliases” option:
If you need aliases though, here you can also configure the behavior when generating them as follows:
SELECT * FROM Production.Product pro -- Use first three letters for alias SELECT * FROM Person.Person P -- Use upper case for alias SELECT * FROM Person.Person AS p -- Use AS keyword with alias
Furthermore, you can manually configure global aliases under the Aliases tab of the Options window. Here, you’ll need to specify server, database, object, and alias as shown below:
Global aliases have precedence over automatically generated ones, so if we do another SELECT statement in SQL Server, we’ll get this:
SELECT * FROM Production.Product pp
Moving on, to select specific columns from the table, remove the star and the hint list will show all the available columns. From here, simply check only needed and they will be inserted:
Remember how we said earlier that the hint list is context-sensitive? This means that as far as the syntax goes, SQL code complete will list valid suggestions in its hint list. If we continue to type a WHERE clause in the example above, it will list appropriate column names, keywords, etc.:
Auto-complete will not only fill in SQL keywords and names, but it will also help you write complex queries such as JOIN statements or cross-database queries:
- For detailed information about how to use the auto-complete feature, see Code completion
Conclusion
In this article, we’ve refreshed our memory on how to query data using SELECT statement in SQL Server. We started with a brief introduction and overview of all major components and then we jumped into SSMS and took a look at how to write everything from a basic SELECT statement in SQL to all the different types of joins.
Then we looked at an extension from ApexSQL Fundamentals Toolkit for SQL Server, ApexSQL Complete that reduces typing by automatically completing SQL statements for us. Furthermore, it automatically inserts fully qualified object names and alias on the server or database level. This tool provides an array of options that make writing SELECT statements in SQL a piece of cake.
January 4, 2021