T SQL Querying Guide
User Manual:
Open the PDF directly: View PDF  .
.
Page Count: 154 [warning: Documents this large are best viewed by clicking the View PDF Link!]

T-SQL Querying Guide 
A Comprehensive Guide for Learning T-SQL 

T-SQL Querying Guide  © The Knowlton Group, LLC  2 | P a g e    
Table of Contents 
About this Guide ........................................................................................................................................... 5 
About the Author .......................................................................................................................................... 6 
Online Course Material .................................................................................. Error! Bookmark not defined. 
Start Here: Installing SQL Server and Sample Data ....................................................................................... 7 
Download and Attach AdventureWorks ................................................................................................... 7 
Section 1: General Database Concepts ....................................................................................................... 10 
Section 2: Literal SELECT Statements .......................................................................................................... 12 
Lab 1: Literal Select Statements: ............................................................................................................. 13 
Section 3: Basic SELECT Statements............................................................................................................ 14 
Lab 2: Basic SELECT Statements .............................................................................................................. 19 
Section 4: Filtering with the WHERE Clause ............................................................................................... 21 
Basics of the WHERE Clause – Part 1 ...................................................................................................... 21 
Lab 3: Using the WHERE Clause Part 1 .................................................................................................... 23 
Symbolic Logic and Truth Tables ............................................................................................................. 23 
Lab 4: Symbolic Logic and Truth Table Practice ...................................................................................... 28 
Using the WHERE Clause Part 2 .............................................................................................................. 28 
Lab 5: Using the WHERE Clause Part 2 .................................................................................................... 35 
Section 5: Sorting using the ORDER BY Clause............................................................................................ 36 
Lab 6: Sorting using the ORDER BY Clause .............................................................................................. 40 
Section 6: Querying Multiple Tables via Joins............................................................................................. 41 
Normalization and Basic Database Design: ............................................................................................ 41 
Basics of the INNER JOIN ......................................................................................................................... 45 
Lab 7: INNER JOIN Practice ..................................................................................................................... 51 
LEFT OUTER JOIN and RIGHT OUTER JOIN .............................................................................................. 51 
Lab 8: Including LEFT OUTER JOINs and RIGHT OUTER JOINs ................................................................. 55 
FULL OUTER JOINs ................................................................................................................................... 56 
Section 7: Aggregate Functions .................................................................................................................. 57 
Lab 9: Aggregate Functions ..................................................................................................................... 60 
Section 8: Grouping with the GROUP BY Clause ......................................................................................... 61 
Lab 10: Grouping with the GROUP BY Clause ......................................................................................... 64 
Section 9: Filtering Groups with HAVING Clause ........................................................................................ 66 
Lab 11: Filtering Groups with the HAVING Clause .................................................................................. 68 

T-SQL Querying Guide  © The Knowlton Group, LLC  3 | P a g e    
Section 10: Built-In SQL Server Functions ................................................................................................... 69 
String Built-In Functions .......................................................................................................................... 69 
Lab 12: String Functions and Nested Functions ...................................................................................... 73 
Date and Time Built-In Functions ............................................................................................................ 73 
Lab 13: Date and Time Built In Functions ............................................................................................... 76 
NULL Handling Functions ........................................................................................................................ 76 
Lab 14: NULL Handling Functions ........................................................................................................... 77 
Section 11: SQL Server Data Types & Type Casting .................................................................................... 78 
Lab 15: SQL Server Data Types & Type Casting....................................................................................... 80 
Section 12: Table Expressions ..................................................................................................................... 81 
Derived Tables......................................................................................................................................... 81 
Lab 16: Using Derived Tables .................................................................................................................. 85 
Using Common Table Expressions .......................................................................................................... 85 
Lab 17: Common Table Expressions ....................................................................................................... 88 
Section 13: CASE Statements ...................................................................................................................... 89 
Lab 18: CASE Statements ........................................................................................................................ 93 
Section 14: Ranking Functions .................................................................................................................... 94 
Lab 19: Ranking Functions ...................................................................................................................... 99 
Section 15: Set Operations........................................................................................................................ 100 
Lab 20: Set Operations .......................................................................................................................... 103 
Section 16: Subqueries.............................................................................................................................. 105 
Inline Subqueries .................................................................................................................................. 105 
Lab 21: Inline Subquery Practice ........................................................................................................... 108 
Correlated Subqueries .......................................................................................................................... 108 
Lab 22: Using Correlated Subqueries .................................................................................................... 110 
Section 17: Advanced Aggregations and Pivoting .................................................................................... 111 
Lab 23: Advanced Aggregations and Pivoting ....................................................................................... 121 
Section 18: SQL Variables .......................................................................................................................... 122 
Lab 24: SQL Variables ............................................................................................................................ 124 
Section 19: WHILE Loops .......................................................................................................................... 125 
Lab 25: WHILE Loops ............................................................................................................................. 128 
Appendix A: Solutions for Lab Questions .................................................................................................. 130 
Lab 1: Literal SELECT Statements .......................................................................................................... 130 

T-SQL Querying Guide  © The Knowlton Group, LLC  4 | P a g e    
Lab 2: Basic SELECT Statements ............................................................................................................ 130 
Lab 3: Using the WHERE Clause Part 1 .................................................................................................. 131 
Lab 4: Symbolic Logic and Truth Tables ................................................................................................ 131 
Lab 5: Using the WHERE Clause Part 2 .................................................................................................. 133 
Lab 6: Sorting Using the ORDER BY Clause ........................................................................................... 134 
Lab 7: INNER JOIN Practice ................................................................................................................... 135 
Lab 8: LEFT OUTER JOINs and RIGHT OUTER JOINs .............................................................................. 136 
Lab 9: Aggregate Functions ................................................................................................................... 137 
Lab 10: Grouping with the GROUP BY Clause ....................................................................................... 137 
Lab 11: Filtering Groups with the GROUP BY Clause ............................................................................ 138 
Lab 12: String Functions and Nested Functions .................................................................................... 139 
Lab 13: Date and Time Built-In Functions ............................................................................................. 140 
Lab 14: NULL Handling Functions ......................................................................................................... 140 
Lab 15: SQL Server Data Types & Type Casting..................................................................................... 141 
Lab 16: Using Derived Tables ................................................................................................................ 142 
Lab 17: Common Table Expressions ..................................................................................................... 142 
Lab 18: CASE Statements ...................................................................................................................... 143 
Lab 19: Ranking Functions .................................................................................................................... 145 
Lab 20: Set Operations .......................................................................................................................... 146 
Lab 21: Inline Subquery Practice ........................................................................................................... 148 
Lab 22: Using Correlated Subqueries .................................................................................................... 149 
Lab 23: Advanced Aggregations and Pivoting ....................................................................................... 149 
Lab 24: SQL Variables ............................................................................................................................ 151 
Lab 25: WHILE Loops ............................................................................................................................. 152 

T-SQL Querying Guide  © The Knowlton Group, LLC  5 | P a g e    
About this Guide 
This guide is designed to train users that have either no SQL or database background or those who have 
some basic experience working with SQL and databases.  There are three main parts to the training 
guide.  Part one consists of sections one through eleven.  This part is geared towards those with no or 
limited prior SQL background.  Certainly, those with some prior background would benefit from the 
lessons and practice problems contained in the first part of the guide, however the main emphasis is on 
the foundation of SQL querying.   
Part two contains sections twelve through seventeen.  This can be classified as the “Intermediate SQL 
Training” section.  It contains lessons and practice problems associated with intermediate concepts such 
as: common table expressions, derived tables, subqueries and more advanced aggregations and 
pivoting.   
The final part of the guide, part three, is the “Advanced SQL Programming and Control Flow” area of the 
guide.  This guide intentionally only briefly explores some the advanced SQL concepts like variables and 
control flow.  Sections eighteen and nineteen comprise the advanced component of this guide. 
Each section will contain a learning subsection and a set of lab questions based on the 
AdventureWorks2012 database (Microsoft’s default training database).  This will give you the 
opportunity to practice in a test environment without the stress of impacting production servers before 
working in a live setting.   
If you proceed to the back of the book and look at Appendix A or Appendix B, you will find solutions to 
all practice problems contained in the guide. Be sure to use the appendix of solutions as you spend time 
practicing on your own! 

T-SQL Querying Guide  © The Knowlton Group, LLC  6 | P a g e    
About the Author 
Brewster Knowlton is the Owner and Principal Consultant of 
The Knowlton Group.  The Knowlton Group is a data and 
analytics consulting firm that specializes in helping 
organizations of all sizes and industries become data-driven.  
Brewster’s entire professional career has been dedicated 
towards developing business intelligence programs and 
solutions for clients in several industries including healthcare, 
banking, government agencies, and retail clients.  
Truly passionate about the power of data and analytics, he 
brings that passion to help each client overcome their data 
challenges and take the necessary steps towards becoming a 
data-driven credit union.  
A summa cum laude graduate of Western New England University with a degree in Mathematical 
Sciences, he was a two-time All-American as a goalie for a nationally ranked lacrosse program. He still 
volunteers his time by training youth, high school, and college lacrosse goalies. 

T-SQL Querying Guide  © The Knowlton Group, LLC  7 | P a g e    
Start Here: Installing SQL Server and Sample Data 
The first thing you need to get started is a FREE copy of Microsoft® SQL Server® 2012 Express.  Navigate 
to  
https://www.microsoft.com/en-us/download/details.aspx?id=29062 and click the red “Download” 
button. 
When the “Choose the download you want” window appears, click the check box next to 
“ENU\x64\SQLEXPRWT_x64_ENU.ex”e if you have a 64-bit machine.  If you have a 32-bit machine, 
download the file named “ENU\x86\SQLEXPRWT_x86_ENU.exe”.  Then press “Next” and download the 
executable file. 
Once downloaded, run the executable and follow the on-screen instructions to install Microsoft® SQL 
Server® 2012 Express.  Be sure to install both the database engine and SQL Server Management Studio 
(SSMS).  Management Studio is the tool that we will be used to complete our queries. 
Download and Attach AdventureWorks  
Once you have the Microsoft® SQL Server® 2012 Express successfully installed, you will need to 
download the sample AdventureWorks2012 database that the course lectures use.  Navigate to 
https://www.dropbox.com/s/igulj4m7lv73eap/AdventureWorks2012_Data.zip?dl=0 and download the 
file named “AdventureWorks2012_Database.zip”.  Once downloaded, extract the files in the zipped 

T-SQL Querying Guide  © The Knowlton Group, LLC  8 | P a g e    
folder.  Move the file with the “.mdf” extension to the folder: C:\Program Files\Microsoft SQL 
Server\MSSQL11.MSSQLSERVER\MSSQL\DATA. 
Open SQL Management Studio (SSMS) once the file is finished downloading.  Once SSMS is open, 
connect to your newly installed SQL Server Express instance.  You can type “.\sqlexpress” in the “Server 
Name” field when first opening SQL Server Management Studio.   Once connected, right click on the 
folder named “Databases”, and then select “Attach…”.   The “Attach Databases” screen will appear as 
shown below: 
Click the “Add…” button to open the “Locate Database Files” screen where you will select the file named 
“AdventureWorks2012_Data.mdf”.  This is the AdventureWorks SQL database file that we downloaded 
previously. 

T-SQL Querying Guide  © The Knowlton Group, LLC  9 | P a g e    
Press “OK” once you have selected the correct file.  Information will populate into the “Attach 
Databases” screen based on what .mdf file you selected.  Since we downloaded the database file but not 
any log file associated with it, click on the row with the word “Log” in the “File Type” column.  As an 
extra step to make sure you selected the correct file, the log row will contain the text “Not Found” in the 
“Message” column.  After confirming you have selected the correct row, click Remove and then press 
OK. 
Right click on the “Databases” folder and select Refresh.  Expand the “Databases” folder.  The 
AdventureWorks2012 database should now appear in the expanded list, and you are ready to start the 
lessons! 

T-SQL Querying Guide  © The Knowlton Group, LLC  10 | P a g e    
Section 1: General Database Concepts 
Before we dive too deep into this guide, we should cover a few basic concepts that are key to 
understanding databases (note: these concepts will be defined in the context of Microsoft SQL Server.  
Though the vast majority of these concepts are universal, you should take the time to understand the 
differences if you will be using another database engine). 
Instance: An instance can be thought of as the installation of SQL Server.  The database engine is the 
application that stores data and allows you to query against it.  Each installation of the database engine 
is a separate instance under most circumstances.  Multiple instances of SQL Server can exist on the same 
Windows server. 
Database: A database can be thought of as an organized collection of data.  The structure of relational 
databases implies that data is stored in tables.  Each database belongs to a SQL instance.  In fact, there is 
a one-to-many mapping between an instance and a database.  That is, an instance may contain one to 
many databases and a database (or many databases) belong to a single SQL Server instance. 
Schema: A schema is a physical grouping of tables in a logical way for the purposes of security and 
business understanding.  Schemas contain one-to-many tables and are often used in databases to help 
create security profiles with more ease.  The AdventureWorks sample database that we will be using has 
several schemas built into their model. 
Table: A table is a collection of data organized in the form of rows and columns.  A table is where the 
data truly lives at the lowest level.  To improve both the performance of queries and the understanding 
of the data within the table, primary and foreign keys are often included in tables. 
To better understand the hierarchical relationship between the previous concepts, take a look at the 
image below: 
Primary Key: A primary key is the column(s) that define uniqueness for each row of a table.  For 
example, a table of customer accounts might use a column named “AccountID” as the primary key.  This 

T-SQL Querying Guide  © The Knowlton Group, LLC  11 | P a g e    
forces each AccountID in the table to be unique.  Therefore, if the AccountID CID123456 appears in the 
table, you will not be able to insert another record with that same AccountID without getting an error. 
Foreign Key: A foreign key is a column in one table that identifies with a row (or rows) of another table 
based on the primary key in that other table.  For example, suppose we have a table called Sales.  This 
table contains three columns: SalesID (representing an incrementing sales number unique to each sale), 
CustomerSSN (representing the customer’s SSN), and SalesAmount.  Then suppose we have another 
table, called Customers, that contains two columns: CustomerSSN and CustomerName.  The 
CustomerSSN column in the Customers table is that table’s primary key.  This means that, as we 
discussed in the primary key definition, that each CustomerSSN appears only once in this table.  We 
would then want to set the CustomerSSN column in the Sales table to be a foreign key with a 
relationship to the CustomerSSN column in the Sales table.  This tells users that if you wanted to get 
customer details associated with each sale, you could use the foreign key relationship on CustomerSSN 
between the two tables to gather the information you are looking for.  This concept will become clearer 
as we discuss the concept of inner and outer joins. 
View: A view is a virtual table whose contents are defined by a query. Like a table, a view consists of a 
set of named columns and rows of data. Unless indexed, a view does not exist as a stored set of data 
values in a database. The rows and columns of data come from tables referenced in the query defining 
the view and are produced dynamically when the view is referenced. 

T-SQL Querying Guide  © The Knowlton Group, LLC  12 | P a g e    
Section 2: Literal SELECT Statements 
A literal SELECT statement is a SELECT statement that does not directly query a particular table and 
return columns, rather it returns the result of a string or expression.  For example, if I wanted to return 
the string “This a T-SQL Beginner Guide”, I would execute the query: 
SELECT 'This is a T-SQL Beginner Guide' 
If you wanted to return a string that said “This is a SQL query”, you would type: 
SELECT 'This is a SQL query' 
There are two key things to notice at this point.  First, all SQL SELECT statements (until we start working 
with common table expressions and other advanced SQL programming concepts) will begin with the 
word “SELECT”.  This tells the database engine that it will completing the operations related to SELECT 
statements and will most likely be returning some data as an output result set.  The second observation 
you should make is that strings in T-SQL are denoted with the single apostrophe and not a quotation 
mark like in any other programming languages.   
In the “Results” tab at the bottom panel in SQL Server Management Studio (SSMS), you will notice that it 
treats the string output as an unnamed column with a single row of data.  We can include multiple 
different strings or expressions in our literal select statement by placing a comma in between them.  For 
example, two return two strings in separate columns – one saying “SQL Query” and another saying “The 
Knowlton Group” – we would execute the query: 
SELECT 'SQL Query', 'The Knowlton Group' 
The comma in between the two strings indicates that each string will be returned in a separate column.  
Commas are T-SQL’s way of indicating multiple columns will be returned by the SELECT statement. 
Mathematical expressions are also commonly embedded in SELECT statements.  Below are a few 
examples of how we can use mathematical operators in expressions for a simple literal SELECT 
statement: 
SELECT 1+1 
SELECT 5*6 
SELECT 5*5-2 
Notice how in all of the examples, we are simply returning a scalar value – that is, a single string or 
mathematical result per query.  Literal SELECT statements by themselves don’t provide a ton of value, 
however, embedding expressions and literal strings within larger queries can provide tremendous value 
during more advanced processes or when applying mathematical operations to column values.  
Mathematical expressions in T-SQL follow the standard order of operations.  Properly placing 
parentheses can alter the order in which the expression is evaluated.  For example, take the two 
following examples.  Adjusting the parentheses within each expression yields different values: 

T-SQL Querying Guide  © The Knowlton Group, LLC  13 | P a g e    
SELECT (5*5)-3+(2*6) 
SELECT 5*(5-3+2)*6 
The first expression yields a value of 34.  The second returns 120.  Order of operations mathematically is 
an important concept to adhere to.  We will encounter a similar concept when we discuss nested 
functions which carry certain similarities to the order of operations that must be keenly observed in 
order to return the desired results. 
By now, you are aware that we are using the single apostrophe to indicate the presence of a string.  
However, you may be asking yourself how you would handle returning a string where an apostrophe is 
present in the actual string value.  To do this, you will use two apostrophes in a row.  For example, to 
return “Adam’s Apple”, you would execute the query: 
SELECT 'Adam''s Apple' 
Or to return “John’s Office”, you would type and execute: 
SELECT 'John''s Office' 
This is another subtlety that must be carefully observed while typing your SQL code to avoid raising 
errors. 
Lab 1: Literal Select Statements: 
1) Execute a literal select statement that returns your name.   
2) Write the literal select statement that evaluates the product of 7 and 4. 
3) Write the literal select statement that takes the difference of 7 and 4 then multiplies that 
difference by 8. 
4) Write a literal select statement that returns the phrase “The Knowlton Group’s SQL Training 
Class”.  (Hint: note the single apostrophe in the string). 
5) Execute a literal SELECT statement that returns the phrase “Day 1 of Training” in one column 
and the result of 5*3 in another column. 

T-SQL Querying Guide  © The Knowlton Group, LLC  14 | P a g e    
Section 3: Basic SELECT Statements 
For the purposes of this training guide, we will consider a basic SELECT statement as one that returns 
one to many columns from a single table with no additional filtering, grouping or sorting clauses.  The 
basic SELECT statement will have the following form: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
Let’s look at some very simply examples of a basic SELECT statement.  Ensure that you are connected to 
the “AdventureWorks2012” database by navigating to the database selection dropdown menu just 
above the Object Explorer and selecting “AdventureWorks2012”.   
You may also execute, in the query editor, the statement: 
USE AdventureWorks2012 
This command will tell the query editor that you are currently using to connect to the 
“AdventureWorks2012” database.  Typically, by default, you will initially be connected to the “master” 
database.  As a side note, you should rarely, if ever, use the “master” database, unless you are executing 
basic literal select statements. 
Once connected the “AdventureWorks2012” database, expand the “AdventureWorks2012” database 
folder in the object explorer.  When complete, expand the “Tables” folder. 

T-SQL Querying Guide  © The Knowlton Group, LLC  15 | P a g e    
This represents the complete list of tables within the “AdventureWorks2012” database.  You will notice 
that each table has some leading text, then a period, and then a name.  The leading text, before the 
period, indicates the schema to which the table belongs.  The schema “dbo” is the default schema for 
any database and table. The text after the period indicates the table name.   
Find the table named “Person.Person”.  Click the expand toggle button the left of the table name.  Once 
expanded, click the toggle button the left of “Columns”.   
You are now able to see all of the columns contained within the Person.Person table and some 
additional details such as the column data type and whether or not the column allows NULL values.  I 
will explain more about NULL values and data types in section 12 of this training guide. 
Now that we are able to identify the columns in the table Person.Person, we can execute our first basic 
SELECT statement.  Suppose we want to return the FirstName column for all rows in the Person.Person 
table, we would execute the query: 

T-SQL Querying Guide  © The Knowlton Group, LLC  16 | P a g e    
SELECT FirstName 
FROM Person.Person 
If we wanted to return only the LastName column, we would execute: 
SELECT LastName 
FROM Person.Person 
If you look in the bottom right hand corner of the results panel, you will see that SSMS indicates how 
many rows of data were returned by the previous SELECT statement that was executed. 
While nearly 20,000 rows is not a particularly large set of returned data, we want to make sure that we 
don’t write inefficient queries.  As we continue to execute SELECT statements, the database engine will 
store more and more data into memory.  Eventually, data will need to be dropped from memory (the 
data is NOT deleted from the hard drive, merely it is removed from the memory of the PC or server – 
this is where your machine’s RAM is necessary) so that new data can be added to memory and returned 
in your results.  So, we want to do our best to limit the results to only what we need.   
Since, we are just exploring the contents of a table, we can employ the TOP operator.  The TOP operator 
limits the number of rows returned by a SELECT statement. 
Let’s write a SELECT statement that returns only the top 500 rows of the FirstName column from the 
Person.Person table.  To do this, we execute: 
SELECT TOP 500 FirstName 
FROM Person.Person 
To limit the number of rows returned, we simply type TOP after SELECT and then indicate the number of 
rows we would like the result set limited to.  Instead of limiting the results by a specific number of rows, 
we can limit the rows by a percentage of the total number of rows.  Suppose we take the same query as 
before but the limit the result to ten percent of the total number of rows in the table.  To do this, we 
execute: 
SELECT TOP 10 PERCENT FirstName 
FROM Person.Person 
Having learned how to limit the number of rows returned by any SELECT statement, let’s return more 
than one column.  Let’s return the FirstName and LastName column from the Person.Person table 
limiting our results to 1000 rows.  To complete this, we execute: 
SELECT TOP 1000 FirstName, LastName 
FROM Person.Person 

T-SQL Querying Guide  © The Knowlton Group, LLC  17 | P a g e    
Suppose we wanted the top 20 percent of all rows for the FirstName, MiddleName and LastName 
columns from Person.Person.  To do this, we would execute: 
SELECT TOP 20 PERCENT FirstName, MiddleName, LastName 
FROM Person.Person 
Changing the table we are referencing, let’s return all rows from the table Production.Product and limit 
the data to just the ProductID, Name and ProductNumber columns.   
SELECT ProductID, Name, ProductNumber 
FROM Production.Product 
Occasionally, we may want to return all columns from a particular table.  Fortunately, SQL has a built in 
command to do just this.  If we wanted to return all columns from the table Production.Product, we 
would execute: 
SELECT * 
FROM Production.Product 
This returns every column that is contained in the table.  This can be helpful to get an idea of the data 
that is stored in a particular table if you are unsure of its contents.  Especially with larger tables, it is a 
best practice to apply the TOP operator to this type of statement.  Let’s use the same query as before, 
except we will limit the results to only the top 100 rows: 
SELECT TOP 100 * 
FROM Production.Product 
The asterisk or star (usually someone, when speaking about a query using an asterisk, will say “select 
star from production dot product”) is a very useful tool when writing SELECT statements and becoming 
more familiar with the data that is contained within a table.  The asterisk will also be helpful when we 
work with aggregate functions – particularly the COUNT() function. 
Now that we have a bearing on how to complete basic SELECT statements, let’s add a few details that 
will improve the clarity of your results.  We often find that business users tend to not want to see 
column names like “ProductNumber” without spaces separating the words.  While databases typically 
avoid using spaces in table names, business users like to see clean, proper words and spacing.   
To modify how the name of the column appears in the results, we can employ column aliases.  Below is 
an example of a column alias being used: 
SELECT Name AS ProductName 
FROM Production.Product 
If you look in the results panel, you will notice the column name is not “Name” rather it is 
“ProductName”.  You may be tempted to execute the query: 
SELECT Name AS Product Name 

T-SQL Querying Guide  © The Knowlton Group, LLC  18 | P a g e    
FROM Production.Product 
If you execute this query, you will be met with an error stating: 
Msg 102, Level 15, State 1, Line 1 
Incorrect syntax near 'Name'. 
This is SQL’s way of telling you that something went wrong near “Name” in the query.  The issue is that 
spaces have very specific purposes in the SQL language parser.  It treats the second “Name” like another 
column, and, therefore, is expecting a column between “Product” and “Name”.  To resolve this issue, we 
can do one a couple things: surround the column alias with quotation marks or surround the column 
alias with square brackets. 
SELECT Name AS "Product Name" 
FROM Production.Product 
SELECT Name AS [Product Name] 
FROM Production.Product 
Using the quotation or square bracket method resolves the SQL parser’s issue with the space separating 
the two words of the column alias.   Let’s apply these newly learned techniques to another example.  
We will be writing a query that returns the top 200 rows from Person.Person.  Let’s return only the 
FirstName, MiddleName and LastName columns but give them each a column alias – “First Name”, 
“Middle Name”, and “Last Name” respectively. 
SELECT TOP 200 
 FirstName AS [First Name],  
 MiddleName AS [Middle Name],  
 LastName AS [Last Name] 
FROM Person.Person 
Upon executing the query, you will notice that each of the columns are properly spaced just as the text 
between the brackets indicates.  You could have also used quotation marks in place of square brackets 
and execute this query: 
SELECT TOP 200 
 FirstName AS "First Name",  
 MiddleName AS "Middle Name",  
 LastName AS "Last Name" 
FROM Person.Person 
If you look back at Section 1 you will notice there was a definition for the term view.  This is essentially a 
virtual table that is constructed by some SELECT statement in the background.  Views are an incredibly 
helpful tool to minimize the querying effort needed to retrieve commonly accessed information.  This 
text will not discuss how views are created and maintained, but it is necessary to address how you 
access them.  Accessing a view is as simple as querying data in a table.  Instead of placing the table name 
in the FROM clause, you can place a view’s name. 

T-SQL Querying Guide  © The Knowlton Group, LLC  19 | P a g e    
The views contained within a database can be seen directly in the Object Explorer.  If you minimize the 
Tables folder, you will notice a folder named “Views”.  Expand this folder. 
This indicates all views that are accessible within the database.  Let’s execute a few queries against a 
view so that you may become comfortable with them. 
If we wanted to return all rows and columns from the view named HumanResources.vEmployee, you 
simply type and execute the query: 
SELECT * 
FROM HumanResources.vEmployee 
Or, if you wanted to return the FirstName, LastName, EmailAddress and PhoneNumber columns from 
the view Sales.vIndividualCustomer, you would execute: 
SELECT FirstName, LastName, EmailAddress, PhoneNumber 
FROM Sales.vIndividualCustomer 
As you can see, there is no noticeable difference between how you query a view and how you query 
against a table.  Views are created to make our lives easier when trying to access information, and it is 
comforting that we do not need to learn any new syntax to reference them.   
While we will continue to build upon what we learned in this chapter, understanding the concepts 
behind the basic SELECT statement is absolutely crucial before advancing to other sections.  Complete 
the exercises, and, if necessary, spend some additional time reviewing the content in this chapter to 
improve your confidence in completing basic SQL SELECT statements. 
Lab 2: Basic SELECT Statements 
1) Retrieve all rows from the HumanResources.Employee table.  Return only the 
NationalIDNumber column. 
2) Retrieve all rows from the HumanResources.Employee table.  Return the NationalIDNumber and 
JobTitle columns. 

T-SQL Querying Guide  © The Knowlton Group, LLC  20 | P a g e    
3) Retrieve the top 20 percent of rows from the HumanResources.Employee table.  Return the 
NationalIDNumber, JobTitle and BirthDate columns. 
4) Retrieve the top 500 rows from the HumanResources.Employee table.  Return the 
NationalIDNumber, JobTitle and BirthDate columns.  Give the NationalIDNumber column an 
alias, “SSN”, and the JobTitle column an alias, “Job Title”. 
5) Return all rows and all columns from the Sales.SalesOrderHeader table. 
6) Return the top 50 percent of rows and all columns from the Sales.Customer table. 
7) Return the Name column from the Production.vProductAndDescription view.  Give this column 
an alias “Product’s Name”. 
8) Return the top 400 rows from HumanResources.Department 
9) Return all rows and columns from the table named Production.BillOfMaterials 
10) Return the top 1500 rows and columns from the view named Sales.vPersonDemographics 

T-SQL Querying Guide  © The Knowlton Group, LLC  21 | P a g e    
Section 4: Filtering with the WHERE Clause 
Basics of the WHERE Clause – Part 1 
Now that we have covered the basics of retrieving columns of data from a table, you may be asking 
yourself how you might filter the rows returned by some criteria.  For example, perhaps we are asked to 
get a list of all products that have a sale price greater than $100.  You might be asked to retrieve a list of 
all customers who purchased a product within a certain date range.  These types of queries are very 
commonly requested and can be easily handled with the WHERE clause. 
We discussed in Section 3 that the generic form of a basic select statement was: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
By adding the WHERE clause to the general form, we now get: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
WHERE [Column Name] {Comparison Operator} {Filter Criteria} 
The third line of the SELECT statement will begin with WHERE and then be followed by the name of the 
column that you will be filtering against.  After indicating the column you will be filtering on, we indicate 
the comparison operator that will be used (these include =, >, < symbols – see the table of operators 
below).  From there, we specify the filtering criteria – this may be a wildcard string, string, expression or 
numeric value.  Below is a basic example of the WHERE clause in use: 
SELECT * 
FROM Production.Product 
WHERE ListPrice > 10 
The query above returns all rows and columns from the table Production.Product where the ListPrice 
column has a value greater than 10.  Notice the form of the WHERE clause: indicate the column to be 
filtered on, then indicate the comparison operator to be used, and finally specify the filtering criteria.   
There are many comparison operators that can be used in a WHERE clause.  The table below contains a 
list of them with descriptions and links to Microsoft’s technical documentation for each: 
Operator 
Meaning 
Microsoft Documentation Link 
= 
Equal to 
= (Equals)  
> 
Greater than 
> (Greater than) 
<  
Less than 
< (Less than) 
>= 
Greater than or equal to 
>= (Greater than or equal to) 
<= 
Less than or equal to 
<= (Less than or equal to) 
<> 
Not equal to 
<> (Not equal to) 
!= 
Not equal to 
!= (Not equal to) 

T-SQL Querying Guide  © The Knowlton Group, LLC  22 | P a g e    
!< 
Not less than 
!< (Not less than) 
!> 
Not greater than 
!> (Not greater than) 
Let’s walk through an example of many of these operators in use.  First, let’s find all employees in the 
HumanResources.vEmployee view whose first name is Chris.   
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName = 'Chris' 
Notice again the basic form of the WHERE clause: identify the column you will be filtering on, then the 
comparison operator you will be used, and lastly the filtering criteria.  In this case, we are filtering on the 
FirstName column, using the “equal to” comparison operator, and specifying we only want those first 
name’s equal to “Chris”. 
Let’s modify the previous example’s filter to find all employees whose first name is NOT Chris.  To do 
this, we simply change the comparison operator from the “=” operator to the “<>” operator: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName <> 'Chris' 
Next, suppose we wish to find all employees from the HumanResources.vEmployee view whose last 
name begins with a letter less than “P”.  To complete this query, we would execute: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE LastName < 'P' 
When using the less than operator with a string value, the database engine will use standard 
alphabetical ordering to determine whether or not a string is less than the given filtering criteria.  For 
example, “Orange” starts with a letter less than “P” therefore “Orange” would be included in the result 
set.  However, “Pair” starts with “P” which is not less than “P” and therefore would not be included.  It 
gets a little trickier if we took the same query as above but applied the greater than operator instead of 
the less than operator. 
SELECT * 
FROM HumanResources.vEmployee 
WHERE LastName > 'P' 
You will notice that one of the last names in the result set is “Pak”.  You might be thinking that both start 
with the letter “P”, so why is “Pak” included in the results?  Think of the filtering criteria of “P” starting 
with the letter “P” and then being followed by a large number of blank spaces – “P_ _ _ _ _ _”.  So, since 
both “Pak” and “P” start with the same letter, the database engine will then go to the next letter and 
evaluate the two-letter string to determine which string is larger.  So, in the example, “Pa” is the first 
two letters of “Pak” and the first two letters of “P” can be thought of as “P” and a blank space, “P_”.  
Since the letter “a” – the second letter of “Pak” – is greater than the blank in “P_”, “Pa” is considered to 
be greater than “P_”.  This is how you would a dictionary would evaluate order, and SQL merely follows 

T-SQL Querying Guide  © The Knowlton Group, LLC  23 | P a g e    
the same evaluation criteria.  If this concept is difficult to wrap your head around, spend some time 
completing some basic queries trying different combinations of strings and operators to further 
comprehend this evaluation criteria.  The lab at the end of this section will also contain some questions 
to help clarify this concept. 
Moving on, let’s take a look at an example of the “greater than or equal to” comparison operator being 
used.  To write a query that returns all rows in the Production.Product table where the ReorderPoint 
column value is greater than or equal to 500, we would execute: 
SELECT * 
FROM Production.Product 
WHERE ReorderPoint >= 500 
Using the previous example as a starting point, we can modify the query to return all rows where the 
ReorderPoint column value is less than or equal to 500: 
SELECT * 
FROM Production.Product 
WHERE ReorderPoint <= 500 
There are three other operators that we haven’t used: the “!=”, “!<”, and “!>” operators.  These are 
considered non-ISO standard operators in the T-SQL syntax, however their use within SQL queries are 
perfectly acceptable.  The other comparison operators allow you to complete any query that the use of 
these three comparison operators would enable you to do; it is more, at this point, a matter of 
preference and being aware that these three non-ISO standard operators exist. 
Lab 3: Using the WHERE Clause Part 1 
1) Return the FirstName and LastName columns from Person.Person where the FirstName column 
is equal to “Mark” 
2) Find the top 100 rows from Production.Product where the ListPrice is not equal to 0.00 
3) Return all rows and columns from the view HumanResources.vEmployee where the employee’s 
last name starts with a letter less than “D” 
4) Return all rows and columns from Person.StateProvince where the CountryRegionCode column 
is equal to “CA” 
5) Return the FirstName and LastName columns from the view Sales.vIndividualCustomer where 
the LastName is equal to “Smith”.  Give the column alias “Customer First Name” and “Customer 
Last Name” to the FirstName and LastName columns respectively. 
Symbolic Logic and Truth Tables 
We have now seen examples of how to filter on a single column with some filtering criteria.  But what 
about filtering on multiple columns within the same query?  For this we can employ a subset of what 
SQL calls logical operators: the AND operator and the OR operator. 

T-SQL Querying Guide  © The Knowlton Group, LLC  24 | P a g e    
The query syntax used when employing these two logical operators is fairly simple.  Instead of ending 
our query with a single filtering statement after the WHERE clause, we can add one of the logical 
operators and include a second filtering criteria.  For example, if we wanted to return all rows from the 
HumanResources.vEmployee view where the employee’s first name is either Chris or Steve, we would 
execute: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName = 'Chris' OR FirstName = 'Steve' 
Notice that the second criteria in the WHERE clause contains all three components of the WHERE clause: 
the column name to be filtered on, the comparison operator, and then the filtering criteria.  The only 
subtle difference is that you do not need to add “WHERE” for a second time.   
Let’s now look at an example of the AND logical operator being used.  Suppose we wanted to return all 
rows from the Production.Product table where the ListPrice value is greater than 100 and the Color 
column has a value of “Red”, we would execute the query: 
SELECT * 
FROM Production.Product 
WHERE ListPrice > 100 AND Color = 'Red' 
The WHERE clause is similar to the previous example; the only difference is the logical operator that we 
have used and the filtering criteria.   
As you may have guessed, we can string together more than two logical operators in a single WHERE 
clause.  For example, let’s suppose we wanted to find all rows in the Production.Product table that have 
a ListPrice greater than 100, a color equal to “Red”, SafetyStockLevel equal to 500, and a Size greater 
than 50.  To complete this query we execute: 
SELECT * 
FROM Production.Product 
WHERE ListPrice > 100 AND Color = 'Red' AND SafetyStockLevel = 500 AND Size > 50 
Nothing complicated has happened here; we simply add another logical operator and then another 
filtering statement.  Complexity increases when we start to mix the two different logical operators we 
have been using in the WHERE clause.  The SQL parser will evaluate the WHERE clause like the order of 
operations – that is AND operations will be evaluated before OR operations.  We will look at a few 
examples of this and show how we can explicitly ensure we get the results we would like. 
Let’s modify the previous example and try to retrieve only those rows from the Production.Product table 
where the ListPrice is greater than 100, the color is “Red” OR the StandardCost is greater than 30.  To do 
this, we would execute the query: 
SELECT * 
FROM Production.Product 

T-SQL Querying Guide  © The Knowlton Group, LLC  25 | P a g e    
WHERE ListPrice > 100 AND Color = 'Red' OR StandardCost > 30 
Examine the results for a brief moment.  You might notice that you have many rows returned where the 
color is not equal to “Red”.  You even find rows with a value in the ListPrice column that is less than 100.  
This is not a mistake – in fact this is the 100% correct data set returned by SQL.  To explain why this is 
the case, we need to take a step back and understand the concept of truth tables and properly 
evaluating Boolean expressions. 
Determining whether or not a Boolean expression returns a TRUE or FALSE value is an exact activity.  
That is, there are very clear rules to be followed – subjectivity is not a factor in determining the truth of 
a Boolean expression.  In most computer science or mathematics curriculums in college, a course in 
symbolic logic is often required for graduation and often a prerequisite for more advanced courses.  Set 
theory in mathematics and much of programming relies heavily on the concepts learned in a symbolic 
logic course.  Truth tables are one of the core concepts in this course, and they help new students of the 
subject determine whether or not a statement is true or false based on the truth of the individual 
components that make up the Boolean expression.   
A truth table breaks down a Boolean expression into its simple components.  Taking one of the filtering 
criteria from one of our earlier examples, let’s look at the query:   
SELECT * 
FROM Production.Product 
WHERE ListPrice > 100 AND Color = 'Red' 
There are two filtering components here: ListPrice is greater than 100, and Color is equal to “Red”.  
Symbolic logic courses, and truth tables, would give each of these components a letter as a symbol: let’s 
use A and B to give it basic.  So operator A is the filtering component that the ListPrice is greater than 
50, and operator B is the filtering component that the color is equal to “Red”.   
In a truth table, we would take these two components and list all possible combinations that these two 
components could be true or false (T implies TRUE and F implies FALSE).  A row’s ListPrice column could 
be greater than 100 or less than or equal to 100.  Therefore the expression “ListPrice > 100” has two 
possible outcomes: true or false.  The same goes for the second filtering criteria “Color = “Red”.  The 
Color column’s value could be “Red” or it might not be “Red”, thus returned either a TRUE or FALSE 
value for that Boolean expression.  The truth table below summarizes the combination of these possible 
outcomes: 
A 
B 
T 
T 
T 
F 
F 
T 
F 
F 
There are four different outcomes for each of these Boolean expressions: A is TRUE and B is TRUE, A is 
TRUE and B is FALSE, A is FALSE and B is TRUE, and A is FALSE and B is FALSE.  To further clarify this with 

T-SQL Querying Guide  © The Knowlton Group, LLC  26 | P a g e    
examples, if the ListPrice of a row was greater than 100 and the Color was “Red”, both A and B (the two 
filtering criteria symbolized) would be TRUE and the first row would represent the evaluation of the two 
expressions.  If the ListPrice was not greater than 100, but the Color was “Red”, then A would be FALSE 
and B would be TRUE, implying the third row symbolizes the truth of each individual component. 
Identifying a Boolean value for each component individually is fairly simple: the ListPrice is either greater 
than 100 or it is not (i.e. it is either TRUE or FALSE).  Where truth tables demonstrate their importance is 
when we apply an operator to the two individual components.  In the WHERE clause of the example we 
have been using, AND is the logical operator connecting the two filtering criteria.  Truth tables will then 
utilize a third column (in our example) so determine the truth of the two individual components when 
combined together by the logical operator in use: 
A 
B 
A and B 
T 
T 
T 
T 
F 
F 
F 
T 
F 
F 
F 
F 
Adding a third column to determine the Boolean evaluation of the expression “component A AND 
component B”, symbolic logic defines truth for the expression.  When evaluation an expression of two 
components joined together by the “AND” operator, the only time “A and B” evaluates as TRUE is if both 
individual components evaluate to TRUE.  This makes sense if we look at it in the context of the SQL 
query.  If the ListPrice is greater than 100, but the Color column does not equal to “Red” then both 
criteria are not true.  It then follows that any row where both criteria are not true, in this example, 
would not evaluate as TRUE and be returned by SQL.  When we use the “AND” operator, we explicitly 
are telling the SQL parser that we only want the rows where BOTH filtering criteria are met.  The above 
truth table simply visually represents the total possible outcomes for the expression given the 
combinations of truth for each individual components.    
Let’s take the same query but modify the comparison operator from AND to OR: 
SELECT * 
FROM Production.Product 
WHERE ListPrice > 100 OR Color = 'Red' 
Take a look at some of the results from this query.  If either the ListPrice is greater than 100 or the Color 
is equal to “Red” then the row is returned to the results panel.  Below is the truth table for two 
components joined with the “OR” operator: 
A 
B 
A or B 
T 
T 
T 
T 
F 
T 
F 
T 
T 
F 
F 
F 

T-SQL Querying Guide  © The Knowlton Group, LLC  27 | P a g e    
Notice how all rows except the last row in the truth table evaluate to TRUE when evaluating the Boolean 
expression “A or B”.  With the OR operator, as long as one of the components yields a TRUE value, then 
the entire expression returns as TRUE.  This is why some rows in the result set for the previous query 
appear in the panel when the ListPrice is greater than 100 but the Color is not “Red”. 
Now that we have covered the basics of the truth tables for the “OR” and “AND” operators for two 
components, what happens when we combine them?  If you remember, this discussion began when 
evaluating the query:  
SELECT * 
FROM Production.Product 
WHERE ListPrice > 100 AND Color = 'Red' OR StandardCost > 30 
So, let’s break this WHERE clause down into the three separate filtering components and create a truth 
table.  Component A will be “ListPrice > 100”, Component B will be “Color = ‘Red’”, and Component C 
will be “Standard Cost > 30”.  Keep in mind that the criteria will be evaluating AND operations first and 
then OR operations. Because of the way the database engine evaluates this criteria, the first Boolean 
expression to be evaluated will be “ListPrice > 100 and Color = ‘Red’” or “A and B”.  The truth table for 
this is: 
A 
B 
A and B 
T 
T 
T 
T 
F 
F 
F 
T 
F 
F 
F 
F 
Once that has been evaluated, then the database engine evaluates the next Boolean expression.  This 
next expression is the result of “A and B” OR component C.  We could visualize this as “(A and B) or C”.  
Now, the truth table for this final Boolean expression is: 
A and B 
C 
(A and B) or C 
T 
T 
T 
T 
F 
T 
F 
T 
T 
F 
F 
F 
So, regardless of whether or not (A and B) evaluates as FALSE, as long as component C is TRUE, then 
entire expression evaluates as TRUE.  This is why we had what seemed to be strange results the first 
time that we evaluated this query.  In fact, as long as one of the components of the OR expression are 
true, the entire expression is TRUE.  So, if component C is FALSE but component (A and B) evaluates to 
TRUE, then (A and B) or C evaluates as TRUE.  In terms of our SELECT statement, if the ListPrice was 
greater than 100 and the Color was “Red” but the StandardCost was not greater than 30, that particular 
row would still appear as part of our results. 

T-SQL Querying Guide  © The Knowlton Group, LLC  28 | P a g e    
This was just a very basic introduction to the concepts of symbolic logic and truth tables.  The lab 
exercises will help to improve and reinforce the concepts learned. 
Lab 4: Symbolic Logic and Truth Table Practice 
1) On a scrap piece of paper, complete the truth table for the Boolean expression, A and B. 
2) On a scrap piece of paper, complete the truth table for the Boolean expression, A or B. 
3) On a scrap piece of paper, complete the truth table for the Boolean expression, (A or B) and C. 
4) Suppose we execute the query: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName < 'K' OR PhoneNumberType = 'Cell' AND EmailPromotion = 1 
Could there be a row in the result set where the employee’s PhoneNumberType equals “Work” 
and their EmailPromotion column value is 0? 
5) Using the same query from question 4, would it be possible for an employee’s FirstName to start 
with a letter greater than “K” and have a PhoneNumberType not equal to “Cell”?  Why or why 
not? 
Using the WHERE Clause Part 2 
Having covered the basics of truth tables and Boolean logic, let’s jump back into the WHERE clause.  We 
left off discussing how we can add two logical operators, “AND” and “OR”, to create multiple filters 
within the same SELECT statement.  One of the concepts we encountered is the order in which SQL 
evaluates the Boolean expression in a WHERE clause.  We do, however, have ways of altering the order 
in which expressions are evaluated with properly placed parentheses. 
Using one of the existing queries we have worked with:  
SELECT * 
FROM Production.Product 
WHERE ListPrice > 100 AND Color = 'Red' OR StandardCost > 30 
How can we modify the order in which the Boolean expression is evaluated?  Currently, it is being 
evaluated by the following truth table: 
A and B 
C 
(A and B) or C 
T 
T 
T 
T 
F 
T 
F 
T 
T 
F 
F 
F 
Perhaps, we want the order to be the Boolean expression to be “A and (B or C)” as opposed the existing 
expression “(A and B) or C”.  To make this change in the SQL statement, we placed parentheses before 
“Color = ‘Red’” and after “StandardCost > 30”: 
SELECT * 

T-SQL Querying Guide  © The Knowlton Group, LLC  29 | P a g e    
FROM Production.Product 
WHERE ListPrice > 100 AND (Color = 'Red' OR StandardCost > 30) 
Now, SQL is evaluating the Boolean expression differently – instead of treating “ListPrice > 100” and 
“Color = ‘Red’” as different components connected together by the “AND” operator, now SQL is 
evaluating “Color = ‘Red’” or “StandardCost > 30” together and then applying the AND operator with 
“ListPrice > 100”.  Looking at the size of the result set is the first indication that this change caused a 
difference – the number of rows returned now is 214.  Before we added the parentheses, the query 
returned 235 rows.  These subtle changes to the order in which the Boolean expressions are evaluated 
makes a tremendous difference in the data that is returned. 
These concepts are being stressed because we often must deal with complicated requests with many 
filters.  Understanding exactly what is being requested and how to handle these requests 
programmatically is critical to your success writing SQL SELECT statements. 
Let’s look at a few more examples.  Suppose, I wanted to find all employees from the 
HumanResources.vEmployeeDepartment view who belong to the “Research and Development” 
department and started at their department before 2005, or whose department is “Executive”.  To 
complete this query, we would execute: 
SELECT * 
FROM HumanResources.vEmployeeDepartment 
WHERE Department = 'Research and Development' AND StartDate < '1/1/2005' 
 OR Department = 'Executive' 
We could also get the same results by adding parentheses.  Adding parentheses is sometimes helpful to 
improve the readability of the query and to understand which filtering criteria you are expecting to be 
evaluated together and in what order: 
SELECT * 
FROM HumanResources.vEmployeeDepartment 
WHERE (Department = 'Research and Development' AND StartDate < '1/1/2005') 
 OR Department = 'Executive' 
With the parentheses now, it seems clearer that we are evaluating the conjunction (“conjunction” is a 
term used to describe two filtering criteria joined together by an “AND” operator) of “Department = 
‘Research and Development’” and “StartDate < ‘1/1/2005’” together first, and then evaluating the 
disjunction (the term used to describe two filtering criteria joined together by an “OR” operator) 
between the previous conjunction and “Department = ‘Executive’”.   
Suppose we wish to alter this request a bit.  Now we wish to find all employees from 
HumanResources.vEmployeeDepartment whose department equals “Research and Development” or 
their StartDate is before 2005 and their Department equals “Executive”.  We would modify the previous 
query slightly and execute: 
SELECT * 
FROM HumanResources.vEmployeeDepartment 

T-SQL Querying Guide  © The Knowlton Group, LLC  30 | P a g e    
WHERE Department = 'Research and Development' OR (StartDate < '1/1/2005' 
 AND Department = 'Executive') 
You might argue that you interpreted the request slightly differently.  Your argument to a slightly 
different interpretation of the request is completely logical and respectable.  This simply stresses the 
importance of understanding exactly what is being requested and the order in which filters are applied 
together. 
For our last example of these complicated Boolean expressions, let’s break down a very complex 
sample.  Suppose I wish to find all stores from the Sales.vStoreWithDemographics view where the 
AnnualSales were greater than 1000000 and BusinessType was equal to “OS”.  I also want to see, in the 
same result, stores that were opened before 1990 (YearOpened less than 1990), have a value in 
SquareFeet greater than 40000 and have more than ten employees.  To complete this query, we would 
type and execute: 
SELECT * 
FROM Sales.vStoreWithDemographics 
WHERE (AnnualSales > 1000000 AND BusinessType = 'OS') OR 
 ( 
    YearOpened < 1990 AND SquareFeet > 40000 AND 
    NumberEmployees > 10 
 ) 
Let’s break each filtering component down symbolically.  Let “AnnualSales > 1000000” be A, 
“BusinessType = ‘OS’” be B, “YearOpened < 1990” be C, “SquareFeet > 40000” be D, and 
“NumberEmployees > 10” be E.  As a Boolean expression, we could express this query as: 
(A and B) or (C and D and E) 
Putting together a simplified truth table for this expression (a condensed version for space and 
compactness): 
A and B 
C and D and E 
(A and B) or (C and D and E) 
T 
T 
T 
T 
F 
T 
F 
T 
T 
F 
F 
F 
So, regardless of whether or not the store had annual sales greater than a million dollars and the 
BusinessType column equaled “OS”, as long as the second portion of the disjunction, (C and D and E), 
evaluated to TRUE, the row was returned as part of the result set.  The row with BusinessEntityID 504 is 
a perfect example of those.  The annual sales were greater than a million dollars, but the BusinessType 
equals “BS” implying that the first part of the conjunction, (A and B), evaluates to FALSE.  However, the 
YearOpened value is less than 1990, the SquareFeet column is greater than 40000 and the 
NumberEmployees column exceeds 10.  Because the second part of the disjunction evaluates to TRUE, 

T-SQL Querying Guide  © The Knowlton Group, LLC  31 | P a g e    
then the whole Boolean expression evaluates TRUE.  This is represented by the third row of the previous 
truth table since (A and B) is FALSE, yet (C and D and E) is TRUE for the row we just examined. 
There are two more logical operators that we have at our disposal to use in the WHERE clause: the IN, 
BETWEEN, and LIKE operators.  The IN operator is used when we want to filter on a list of items for a 
particular column.  For example, if I wanted to find all employees from the view 
HumanResource.vEmployee whose first name was either “Chris”, “Stacy”, “Michael”, or “Li”, I could type 
out an inefficient query like: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName = 'Chris' OR FirstName = 'Stacy' 
 OR FirstName = 'Michael' OR FirstName = 'Li' 
However, the IN operator allows us to make this much simpler: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName IN ('Chris', 'Stacy', 'Michael', 'Li') 
We have been able to greatly condense the WHERE clause and minimize the redundancy in repeating 
“FirstName = “ for each name we wish to filter on.  We can use multiple IN operators in the same query.  
For example, suppose we wanted to find all employees with the first name in the list from the previous 
query or whose last name was either “Hill”, “Miller”, “Brown” or “Zhang”.  To find these employees, we 
would execute: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName IN ('Chris', 'Stacy', 'Michael', 'Li') 
 OR LastName IN ('Hill', 'Miller', 'Brown', 'Zhang') 
In short, the IN operator allows you to condense a list of multiple filtering components separated by an 
OR operator into a simple, concise, and efficient filtering component. 
The next operator we have at our disposal is the BETWEEN operator.  This allows you to filter a column 
based on a range of values.  For example, before the BETWEEN operator, if we wanted to find all stores 
from the Sales.vStoreWithDemographics view who had annual sales between one million and two 
million, you would have to execute: 
SELECT * 
FROM Sales.vStoreWithDemographics 
WHERE AnnualSales >= 1000000 AND AnnualSales <= 2000000 
The BETWEEN operator simplifies this a bit for us.  With the BETWEEN operator, we could condense the 
previous query into: 
SELECT * 
FROM Sales.vStoreWithDemographics 
WHERE AnnualSales BETWEEN 1000000 AND 2000000 

T-SQL Querying Guide  © The Knowlton Group, LLC  32 | P a g e    
This is helpful to improve the readability of your SELECT statements and to reduce unnecessary typing.  
Note that the BETWEEN operator uses an inclusive range only; be careful if you are trying to create an 
exclusive range filter. 
The next logical operator we look at in this section is the LIKE operator.  The LIKE operator tells SQL that 
you will using a wildcard operator.  Wildcard operators allow you to search for values within a column 
based off knowing only parts of the string you are looking for.  For example, if I wanted to find all 
employees from HumanResources.vEmployee whose name starts with “Mi”, we would execute: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName LIKE 'Mi%' 
The “%” symbol after “Mi” tells SQL that it will be looking for any value in the FirstName column that 
starts with “Mi” and is followed by zero to many characters after.  So, “Michael” appears in the result 
set because the name starts with “Mi” and then is followed by some amount of characters after.  If 
someone’s first name was simply “Mi” they would also be returned by the query.   
There are four wildcard characters that the LIKE operator employs: 
Wildcard Character 
Description 
% (percent symbol) 
Any string of one or more characters 
_ (underscore) 
A single character 
[ ]  
A single character confined to a specified group of allowable characters 
[ ^ ]  
A single character not contained in the specified range of allowable characters 
The underscore character allows you to use the wildcard operator for a single character.  For example, 
suppose we wanted to find all employees from HumanResources.vEmployee whose name starts with 
“Mi” and then ends with some character.  We could use the underscore wildcard character to complete 
this request: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName LIKE 'Mi_' 
You’ll notice that “Min” is returned in the results.  This is the only first name in our table that starts with 
“Mi” and is followed by only a single character.  The wildcard characters do not have to appear after the 
string, but can also appear before the string.  So, if we wanted to find all employees whose first name 
starts with some letter then ends in “on”, we would execute: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName LIKE '_on' 

T-SQL Querying Guide  © The Knowlton Group, LLC  33 | P a g e    
The wildcard character can be used anywhere in the string you are searching against, in fact.  The square 
bracket range wildcard character can be used like a more advanced version of the underscore character.  
This allows you to limit the range of possible characters that could appear.  Let’s find all employees 
whose name starts with a “D”, is followed by either an “a” or an “o”, and ends with an “n”.  To complete 
this query, we execute: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName LIKE 'D[a,o]n' 
Instead of separating the characters in the square bracket character by commas, we can indicate a range 
of letters with a hyphen.  So, in the previous example, if we changed the criteria to search for employees 
who first name starts with a “D”, is followed by a letter between “a” and “p” in the alphabet, and then 
ends with “n”, we could execute the query: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName LIKE 'D[a-p]n' 
The bracket with carat wildcard character acts similarly to the bracket character without the carat 
except that the carat indicates you do NOT want to return the characters specified.  So, to find all 
employees whose first name starts with a “D”, is followed by some character that is not an “o”, and ends 
with an “n”, we would type: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE FirstName LIKE 'D[^o]n' 
Read the carat as “not” when understanding how the character can be used.  Just like with the bracket 
character, you can specify that a range of characters be excluded in the search. 
There are many things that you can do with wildcard characters that can help you when searching for 
portions of strings within a column.  Be careful not to use these excessively as the performance of the 
query will decrease significantly when you employ too many of them in a single query.  Searches with 
wildcard characters are not optimized by the database engine compared with basic queries that can 
take advantage of indexes.  When we work on live systems with larger amounts of data, this will become 
immediately apparent. 
By this point in the guide, you have written several queries that may have resulted in NULL values 
appearing somewhere in the results.  The NULL value is a very important concept to understand as we 
advance further into the training material.  A NULL value in a column implies that there is nothing for a 
value in that column.  This shouldn’t be thought of as a blank space.  A blank space is a value.  NULL is 
truly nothingness.  Handling NULL values also has certain subtleties that need to be observed to 
complete successful and error-proof SQL code. 

T-SQL Querying Guide  © The Knowlton Group, LLC  34 | P a g e    
Unlike the previous examples of the WHERE clause, trying to filter on NULL values requires slightly 
different comparison operators.  For example, to find all rows from the Person.Person table who do not 
have a middle name (i.e. the MiddleName column contains a NULL value), we would complete the 
query: 
SELECT * 
FROM Person.Person 
WHERE MiddleName IS NULL 
Notice that we do not say “WHERE MiddleName = NULL”.  Trying to do that will result in no rows 
returned: 
SELECT * 
FROM Person.Person 
WHERE MiddleName = NULL 
In a SELECT statement, you will never say a column name or value “equals” NULL.  The use of the IS 
operator is necessary in this instance.  Similarly, to find all rows where the middle name is not NULL, we 
write: 
SELECT * 
FROM Person.Person 
WHERE MiddleName IS NOT NULL 
By adding the NOT after the IS operator, we negate the filtering criteria and return only those rows with 
some value in the MiddleName column.  It is important to remember that even a blank value in the 
MiddleName value will be returned by the above query because a blank value is not equivalent to a 
NULL value.   
Like any other filtering criteria in the WHERE clause, filtering on NULL values can be combined with 
outer filtering criteria using operators like AND and OR.  For example, to find all employees from the 
view HumanResources.vEmployee who have a listed middle name and a PhoneNumberType value equal 
to “Cell”, we would execute the query: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE MiddleName IS NOT NULL AND PhoneNumberType = 'Cell' 
There is absolutely no change to how we combine filtering criteria.  The only difference between 
filtering on NULL values and filtering with standard values is the use of the IS and IS NOT operator in 
place of the standard comparison operators (like “=”, “<>”, “>=”, etc.).   
The WHERE clause contains many different methodologies with which to filter your data set.  
Understand the concepts of Boolean logic and truth tables can assist you in determining how to write 
your queries to be successful.  Understanding all of the different filtering methods may take time to fully 
grasp, however continue to practice and you will eventually become quite comfortable with the 

T-SQL Querying Guide  © The Knowlton Group, LLC  35 | P a g e    
techniques.  The many lab questions that follow will help you further grasp the concepts we have 
covered in this section. 
Lab 5: Using the WHERE Clause Part 2 
1) Using the Sales.vIndividualCustomer view, find all customers with a CountryRegionName equal 
to “Australia” or all customers who have a PhoneNumberType equal to “Cell” and an 
EmailPromotion column value equal to 0.  (Hint: the correct query requires the use of 
parentheses in your WHERE clause) 
2) Find all employees from the view HumanResources.vEmployeeDepartment who have a 
Department column value in the list of: “Executive”, “Tool Design”, and “Engineering”.  
Complete this query twice – once using the IN operator in the WHERE clause and a second time 
using multiple OR operators. 
3) Using HumanResources.vEmployeeDepartment, find all employees who have a StartDate 
between July 1, 2000 and June 30, 2002.  Complete this query twice – once using the BETWEEN 
operator and then by using a combination of the “greater than or equal to” and “less than or 
equal to” operators. 
4) Find all customers from the view Sales.vIndividualCustomer whose LastName starts with the 
letter “R”.  (Hint: a wildcard character can assist you with this query) 
5) Find all customers from the view Sales.vIndividualCustomer whose LastName ends with the 
letter “r”.  (Hint: a wildcard character can assist you with this query) 
6) Find all customers from the view Sales.vIndividualCustomer whose LastName is either “Lopez”, 
“Martin”, or “Wood” and whose FirstName starts with any letter between “C” and “L” in the 
alphabet. (Hint: multiple wildcard characters will be used in this query) 
7) Return all columns from the Sales.SalesOrderHeader table for all sales that are associated with a 
sales person.  That is, return all rows where the SalesPersonID column does not contain a NULL 
value. 
8) Return the SalesPersonID and TotalDue columns from Sales.SalesOrderHeader for all sales that 
do not have a NULL value in the SalesPersonID column and whose TotalDue value exceeds 
$70,000. 

T-SQL Querying Guide  © The Knowlton Group, LLC  36 | P a g e    
Section 5: Sorting using the ORDER BY Clause 
By this point of the training guide, we should be able to successfully complete SELECT statements that 
allow us to filter our data set based on a specified criteria.  Quite a bit of complexity enters when we 
discuss filtering – hence, the extended and detailed last section on the WHERE clause.  The WHERE 
clause, however, is only the third of six standard clauses that a SELECT statement can employ.  As you 
may have guessed, this section discusses how we can sort our results in any SELECT statement. 
The ORDER BY clause will always be the last clause when typed out in a SELECT statement.  The two 
other clauses that we will discuss in subsequent sections (the GROUP BY and HAVING clauses) will 
appear before the ORDER BY clause.  Despite the ORDER BY clause appearing last in how we type a 
SELECT statement, it makes sense to discuss its usage at this point. 
As with each new clause we add to our toolbox, the SQL SELECT statement general form is expanded.  
Our general form was: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
WHERE [Column Name] {Comparison Operator} {Filter Criteria} 
And is now: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
WHERE [Column Name] {Comparison Operator} {Filter Criteria} 
ORDER BY {[Column Name], [Column Alias], [Column Ordinal]} [ASC/DESC] 
You will notice that the ORDER BY clause can use one of three possible options: a column name, a 
column alias, or a column ordinal.  Then, using the “ASC” or “DESC” option, identify whether or not you 
will be ordering the indicated column in an ascending or descending fashion.  You may choose not to 
place an “ASC” or “DESC” option after the column name, alias or ordinal, however SQL will default to 
ascending order in that instance. 
Using a column name to order your results is fairly straightforward.  Suppose we wish to return the 
FirstName and LastName columns from the view Sales.vIndividualCustomer and have our results 
ordered by the FirstName column in ascending order.  To complete this request, we would execute: 
SELECT FirstName, LastName 
FROM Sales.vIndividualCustomer 
ORDER BY FirstName ASC 
Or: 
SELECT FirstName, LastName 
FROM Sales.vIndividualCustomer 
ORDER BY FirstName 

T-SQL Querying Guide  © The Knowlton Group, LLC  37 | P a g e    
Since SQL automatically assumes that you will be ordering in ascending order - unless stated otherwise – 
the results will be the same regardless of whether we leave the “ASC” off or include it.  To improve 
readability, it might be beneficial to include the “ASC” anyway out of habit, but it is certainly not 
necessary. 
If we wanted to take the same query used in the previous example but order the results by the 
FirstName column in descending order, we would execute: 
SELECT FirstName, LastName 
FROM Sales.vIndividualCustomer 
ORDER BY FirstName DESC 
The ORDER BY clause also allows you to sort the results by identifying the column alias you wish to sort 
by.  This is possible because of the way SQL evaluates and executes a query.  When we get through the 
GROUP BY and HAVING clauses, our SELECT statements will have the complete general form (we will 
walk through the GROUP BY and HAVING clauses in later sections): 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
WHERE [Column Name] {Comparison Operator} {Filter Criteria} 
GROUP BY [Column Name] 
HAVING {[Aggregate Function]} {Comparison Operator} {Filtering Critera} 
ORDER BY {[Column Name], [Column Alias], [Column Ordinal]} [ASC/DESC] 
As you can see, the order in which we type the clauses is: SELECT, FROM, WHERE, GROUP BY, HAVING, 
and lastly ORDER BY.  You may be asking yourself why we can use a column alias in the ORDER BY clause, 
but we couldn’t do this when using the WHERE clause.  This is because of the order in which the 
database engine actually executes the query.  Even though we type the query in the order we do, the 
database engine actually evaluates the clauses in this order: FROM, WHERE, GROUP BY, HAVING, 
SELECT, and ORDER BY.  Since the column alias is specified in the SELECT clause, the database engine 
doesn’t actually recognize the alias when it evaluating the WHERE clause.  This is because the WHERE 
clause is evaluated by SQL before the SELECT clause.  This is a subtle technical note, yet it is helpful to 
remember this when troubleshooting errors and understanding more about SQL Server as a whole.   
Using a column alias in the ORDER BY clause is as simple as using a standard column name.  Let’s take 
the previous example we worked with but give the FirstName and LastName columns two aliases: “First 
Name” and “Last Name”.  To order by the “Last Name” alias in ascending order, we would execute: 
SELECT FirstName AS [First Name], LastName AS "Last Name" 
FROM Sales.vIndividualCustomer 
ORDER BY [Last Name] ASC 
As you can see, ordering by a column alias is an identical process to ordering by a column name; the only 
difference is what you are choosing to order by. 
Ordering by the column ordinal is also a helpful option.  The term “column ordinal”, in this context, 
means the numbered position that the column appears in the SELECT clause.  For example, in the query 

T-SQL Querying Guide  © The Knowlton Group, LLC  38 | P a g e    
below the FirstName column represents column ordinal 1, and the LastName column represents column 
ordinal 2: 
SELECT FirstName, LastName 
FROM Sales.vIndividualCustomer 
So, if we wanted to order by the LastName column using the column ordinal, we would execute: 
SELECT FirstName, LastName 
FROM Sales.vIndividualCustomer 
ORDER BY 2 ASC 
And to order the LastName column in descending order: 
SELECT FirstName, LastName 
FROM Sales.vIndividualCustomer 
ORDER BY 2 DESC 
The process is completely identical to the other ordering methods, we are simply using the ordinal 
instead of the name or alias to explicitly identify how we are ordering our results. 
Order by multiple columns is also an option the ORDER BY clause allows.  Suppose we wish to return the 
AnnualSales, YearOpened and SquareFeet columns from the view Sales.vStoreWithDemographics.  Then 
suppose we want to order the results by AnnualSales in descending order and then by YearOpened in 
ascending order.  To complete this task, we execute: 
SELECT AnnualSales, YearOpened, SquareFeet 
FROM Sales.vStoreWithDemographics 
ORDER BY AnnualSales DESC, YearOpened ASC 
SQL will order our results by AnnualSales, from largest to smallest, first.  Then it orders by the 
YearOpened column from oldest to newest.  So, in the event of multiple rows having the same 
AnnualSales value, the YearOpened column will then be used to determine which rows appear in what 
order.  You can see this in some of the first few rows returned: 
You will notice that the first six rows all have the same AnnualSales and YearOpened value.  Then, in the 
seventh row, you will see the YearOpened is different – two years later than the YearOpened value in 

T-SQL Querying Guide  © The Knowlton Group, LLC  39 | P a g e    
the previous six rows.  This is where the second ordering criteria “YearOpened ASC” assists in 
determining the results order. 
While we can also order by multiple columns, we can also use different methods in identifying the 
sorting criteria.  Let’s use the same query as the previous example except let’s give the YearOpened 
column the alias “Year Opened”.  Let’s also sort by the SquareFeet column in descending order, but only 
after having sorted by the AnnualSales and YearOpened columns.  Also, we will order using the column 
alias for the YearOpened column and the column ordinal when referring to the SquareFeet column.  
Putting all of this together, we get: 
SELECT AnnualSales, YearOpened AS [Year Opened], SquareFeet 
FROM Sales.vStoreWithDemographics 
ORDER BY AnnualSales DESC, [Year Opened] ASC, 3 DESC 
Just like with the other clauses, we use all four of our SELECT statement clauses together in a single 
statement.  For example, suppose we wish to return the LastName, FirstName and SalesQuota columns 
from the Sales.vSalesPerson views.  We only want to return those rows where the SalesQuota column is 
greater than or equal to 250,000.  Lastly, we want the results ordered by the SalesQuota column (from 
largest to smallest) and then by the LastName column in ascending order.  We could complete this query 
using a few different similar queries: 
SELECT LastName, FirstName, SalesQuota 
FROM Sales.vSalesPerson 
WHERE SalesQuota >= 250000 
ORDER BY SalesQuota DESC, LastName ASC 
Or: 
SELECT LastName, FirstName, SalesQuota 
FROM Sales.vSalesPerson 
WHERE SalesQuota >= 250000 
ORDER BY 3 DESC, 1 ASC 
Or: 
SELECT LastName, FirstName, SalesQuota 
FROM Sales.vSalesPerson 
WHERE SalesQuota >= 250000 
ORDER BY 3 DESC, LastName ASC 
We can use the different sorting methods we have at our disposal to modify the look of the query.  
Notice how the WHERE clause has absolutely no effect on what we do in the ORDER BY clause.  We 
simply add our WHERE clause with the filtering criteria and then define the sorting criteria in the ORDER 
BY clause.  No complexity or complication is added despite the extra clause included in the statement. 
While there certainly are some subtleties to note, the ORDER BY clause has easy to understand rules and 
procedures. It also offers some unique flexibilities that other clauses simply do not allow.  Complete the 
labs at the end of this section to reinforce your knowledge of the ORDER BY clause. 

T-SQL Querying Guide  © The Knowlton Group, LLC  40 | P a g e    
Lab 6: Sorting using the ORDER BY Clause 
1) From the HumanResources.vEmployeeDepartment view, return the FirstName, LastName and 
JobTitle columns.  Sort the results by the FirstName column in ascending order. 
2) Modify the query from question 1 to sort the results by the FirstName column in ascending 
order and then by the LastName column in descending order. 
3) From the Sales.vIndividualCustomer view, return the FirstName, LastName and 
CountryRegionName columns.  Sort the results by the CountryRegionName column.  Use the 
column ordinal in the ORDER BY clause. 
4) From the Sales.vIndividualCustomer view, return the FirstName, LastName and 
CountryRegionName columns for those rows with a CountryRegionName that is either “United 
States” or “France”.  Sort the results by the CountryRegionName column in ascending order.  
5)  From the Sales.vStoreWithDemographics view, return the Name, AnnualSales, YearOpened, 
SquareFeet, and NumberEmployees columns.  Give the SquareFeet column the alias “Store Size” 
and the NumberEmployees column the alias “Total Employees”.  Return only those rows with 
AnnualSales greater than 1,000,000 and with NumberEmployees greater than or equal to 45.  
Order your results by the “Store Size” alias in descending order and then by the “Total 
Employees” alias in descending order.  

T-SQL Querying Guide  © The Knowlton Group, LLC  41 | P a g e    
Section 6: Querying Multiple Tables via Joins 
Normalization and Basic Database Design: 
In the majority of the examples throughout this training guide, we have queried against views instead of 
database tables.  Views are specifically designed to simplify the code required to get commonly 
requested information – like consolidated information about employees or customers.  Tables in most 
transactional databases are not designed with this consolidation in mind, rather they are designed with 
a concept of normalization in mind.  This normalization, which we will discuss momentarily, reduces the 
amount of redundancy and repetition in the data stored within a table.  These transactional databases 
are typically designed to allow for improved write operations: inserting and modifying data.  We are 
interested in read operations – returning data – when we write and execute our SELECT statements.   
Systems designed for transactions typically require a SQL user to query against multiple tables at once to 
return the information they are looking for.  Up to this point, we have only focused on how to return 
data from a single table or view.  Using a concept called joins, we will be able to return columns from 
multiple tables in the same SQL SELECT statement. 
If you remember back in Section 1, we discussed a few key database concepts.  There are two in 
particular that will be necessary to understand before talking about joins: primary keys and foreign keys.  
Before going any further, re-read the definitions of each. 
In the introduction to this section, a concept called normalization was mentioned.  Microsoft’s 
Developer Network (MSDN) defines normalization as “the process of refining tables, keys, columns, and 
relationships to create an efficient database”.  Let’s take a look at an example which will best explain 
this concept. 
Assume we have a table, named Book, that has information about books.  Each row contains 
information about the authors, publishers and information about the book itself.  Below might be an 
example of what this table looks like: 
Title 
ISBN 
Publish Date 
Publisher 
Author 1 
Author 2 
Author 3 
Book 1 
123456789 
June 1, 2008 
Publisher 1 
Author A 
Author B 
Book 2 
124356798 
September 12, 2009 
Publisher 2 
Author B 
Author C 
Book 3 
873781927 
April 6, 2011 
Publisher 3 
Author A 
Author C 
Author D 
Book 4 
826389163 
January 4, 2010 
Publisher 1 
Author B 
Author D 
Author E 
Book 5 
129471352 
December 12, 2013 
Publisher 2 
Author D 

T-SQL Querying Guide  © The Knowlton Group, LLC  42 | P a g e    
There are a few important details to notice in this table.  First, the same publishers and authors are 
repeated multiple times throughout this table.  Most books have multiple authors and certainly several 
authors write multiple books.  There exists significant repetition in each row of this table.  This table is 
known as a denormalized table – that is, repetition and redundancy exist.   
If we wanted to normalize this table, we would look at the key areas of redundancy.  First, a publisher is 
the publisher for multiple books.  However, there is no more than one publisher associated with each 
book.  The first step in normalizing this table would then be to set up a publisher table.  The table would 
look like: 
PublisherID 
Publisher Name 
1 
Publisher 1 
2 
Publisher 2 
3 
Publisher 3 
The first thing you will most likely notice is the PublisherID column.  In a normalized database (most 
databases in fact), a key or ID column will be created.  This key is usually an integer and, in this example, 
the PublisherID represents the primary key of the Publisher table.  For many reasons, including query 
performance and uniqueness identification, ID columns are almost a near necessity. 
Getting back to our normalization example, since we have created a Publisher table with a key assigned 
to each publisher, let’s go ahead and replaced “Publisher 1” with the PublisherID of 1, “Publisher 2” with 
the PublisherID of 2, and “Publisher 3” with the PublisherID of 3.  So, our main table, Book, now looks 
like: 
Title 
ISBN 
Publish Date 
Publisher 
Author 1 
Author 2 
Author 3 
Book 1 
123456789 
June 1, 2008 
1 
Author A 
Author B 
Book 2 
124356798 
September 12, 2009 
2 
Author B 
Author C 
Book 3 
873781927 
April 6, 2011 
3 
Author A 
Author C 
Author D 
Book 4 
826389163 
January 4, 2010 
1 
Author B 
Author D 
Author E 
Book 5 
129471352 
December 12, 2013 
2 
Author D 
The value in the publisher column represents a row in the Publisher table.  This Publisher column in the 
main book table can now be thought of a foreign key.  That is, the Publisher column in the Book table 
has a foreign key relationship with the PublisherID column of the Publisher table.   
Moving on, let’s handle the redundancy relating to the authors.  The relationship between authors and 
books is what is known as a “many-to-many” relationship.  This is because an author can be related to 
more than one book, while a book can be related to more than one author.  The relationship between 
publishers and books, on the other hand, was known as a “one-to-many” relationship.  That is because 

T-SQL Querying Guide  © The Knowlton Group, LLC  43 | P a g e    
one publisher can be related with more than one book, but a book can only be related to a single 
publisher. 
To handle the “many-to-many” relationship that we have between authors and books, we start just like 
we did with the publishers.  We will create an Author table that contains two columns: AuthorID and 
Author Name: 
AuthorID 
Author Name 
1 
Author A 
2 
Author B 
3 
Author C 
4 
Author D 
5 
Author E 
We will replace each author with the associated AuthorID from the Author table.  We will also create a 
BookID column for our Book table to follow proper form.  The Book table now looks like: 
BookID 
Title 
ISBN 
Publish Date 
Publisher 
Author 1 
Author 2 
Author 
3 
1 
Book 1 
123456789 
June 1, 2008 
1 
1 
2 
2 
Book 2 
124356798 
September 12, 2009 
2 
2 
3 
3 
Book 3 
873781927 
April 6, 2011 
3 
1 
3 
4 
4 
Book 4 
826389163 
January 4, 2010 
1 
2 
4 
5 
5 
Book 5 
129471352 
December 12, 2013 
2 
4 
This still isn’t quite a completely normalized table even though we now have a foreign key relationship 
on Author 1, Author 2, and Author 3 to the AuthorID column of the Author table.  For “many-to-many” 
tables, a bridge table is often need to bridge the gap between the two previous tables.  We are going to 
create this bridge table by having a table that contains two columns: BookID and AuthorID.  Our goal is 
to have the complete list of combinations of BookID and AuthorID that we see in our Book table 
represented in this new bridge table.  This new bridge table, which we will call BookAuthor, will look 
like: 
BookID 
AuthorID 
1 
1 
1 
2 
2 
2 
2 
3 
3 
1 
3 
3 

T-SQL Querying Guide  © The Knowlton Group, LLC  44 | P a g e    
3 
4 
4 
2 
4 
4 
4 
5 
5 
4 
The primary key of BookAuthor, the column or columns that define uniqueness for the table, will be the 
combination of BookID and AuthorID.  Notice how each combination is distinct; this details is important 
and will become more apparent when we start practicing joins with SQL statements. 
Since we have a bridge between the book and author(s) established, our Book table can be trimmed 
down to: 
BookID 
Title 
ISBN 
Publish Date 
Publisher 
1 
Book 1 
123456789 
June 1, 2008 
1 
2 
Book 2 
124356798 
September 12, 2009 
2 
3 
Book 3 
873781927 
April 6, 2011 
3 
4 
Book 4 
826389163 
January 4, 2010 
1 
5 
Book 5 
129471352 
December 12, 2013 
2 
Why can this be done?  All of the basic author details are stored in the Author table.  All of the 
information connecting each author to each book is stored in the BookAuthor bridge table.  We have 
eliminated the necessary redundancy in storing multiple columns of author details in the Book table.   
We now have a completely normalized database of tables.  A database diagram is an effective way to 
visualize each table’s relationship with one another.  The database diagram of our example above would 
look like: 

T-SQL Querying Guide  © The Knowlton Group, LLC  45 | P a g e    
The table adjacent the gold key in the lines connected two tables implies that table contains the primary 
key and the table with the infinity symbol contains a foreign key that references the table on the other 
end of the connecting line.  For example, Book contains a column, PublisherID, that is a foreign key and 
references the primary key of Publisher, PublisherID.   
We have successfully eliminated all possible redundancy from our original version of the Book table.  
Understanding the concepts of normalization and how data is typically organized is essential to 
becoming proficient in SQL querying.  This small, trivial example has helped us grasp the concepts of 
basic database design, primary and foreign keys, and – without you even realizing it – the underlying 
principles necessary for joins. 
Basics of the INNER JOIN 
A join, at the simplest explanation possible, is the process by which you can connect two tables 
together.  There are four types of joins: INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL 
OUTER JOIN.  An INNER JOIN can be thought of as the intersection of tables based on some key.  In well-
designed databases, an INNER JOIN (all join types, in fact), is defined by the connection of a foreign key 
from one table to the primary key in another table.  Venn diagrams are often used to visualize what 
happens when joins are completed.  With that in mind, below is the visual representation of the INNER 
JOIN: 

T-SQL Querying Guide  © The Knowlton Group, LLC  46 | P a g e    
Notice how only the space shared by both tables is shaded in the diagram above.  Let’s work through an 
example of an INNER JOIN through a few queries and further explain the concept as we go. 
Execute the query below in SQL Server Management Studio while connected to the 
AdventureWorks2012 database: 
SELECT P.Name, P.ProductNumber, PS.Name AS ProductSubCategoryName 
FROM Production.Product P 
INNER JOIN Production.ProductSubcategory PS 
ON P.ProductSubcategoryID = PS.ProductSubcategoryID 
There are a few new concepts introduced here.  First, you will notice the letters “P” and “PS” after the 
table names.  These are called table aliases.  They act similarly to column aliases in that you can use the 
alias to reference the table during your SELECT statement.  Unlike the column alias, you can use this 
table alias in all clauses of the SELECT statement.  Using table aliases when joining drastically improves 
readability of the code and minimizes unnecessary typing.  The second thing you will notice is the form 
of the JOIN.  After the FROM clause, and before any other clause, we specify the phrase “INNER JOIN” 
and then identify which table we would like to join to.  In this case, we want to join the 
Product.ProductSubcategory table.  After giving the Product.ProductSubcategory table the alias “PS”, we 
type “ON” and then indicate how we are joining the two tables together.  Knowing what column to join 
on is critical for success in joining tables together.  In this example, the ProductSubcategoryID in the 
Production.Product table is a foreign key that references the ProductSubcategoryID column in the 
Product.ProductSubcategory table. 
A valid question at this point is “how do you know which columns to join?”  The object explorer is 
helpful in this instance.  In the object explorer, expand the “Tables” folder under 
“AdventureWorks2012” if it is not already.  Find the table named “Production.Product”, and click the 
expand button to left of it.  Find the toggle button to the left of “Keys” and click it. 

T-SQL Querying Guide  © The Knowlton Group, LLC  47 | P a g e    
The first key, the gold key, contains information about the primary key of this table.  Well named keys 
often tell you exactly what columns are contained in the key.  Double click on the second key named 
“FK_Product_ProductSubcategory_ProductSubcategoryID”.  Click the blank space to the right of the 
“Table and Column Specifications” option, and click the ellipsis button to the right of that.  The “Tables 
and Columns” window will appear.  This window details how this particular key is defined.   
This table tells you that the ProductSubcategoryID in the Production.Product table (see the column on 
the right) is a foreign key with a relationship to the ProductSubcategoryID column of the 
Production.ProductSubcategory table.  Given the foreign key relationship, joining on these columns is 
designed for this use. 
Let’s take a look at the query in our example.  Run the query below and note the number of rows 
returned: 
SELECT * 
FROM Production.Product 
The table Production.Product returns 504 rows.  Now re-run our original query below and notice the 
rows returned: 

T-SQL Querying Guide  © The Knowlton Group, LLC  48 | P a g e    
SELECT P.Name, P.ProductNumber, PS.Name AS ProductSubCategoryName 
FROM Production.Product P 
INNER JOIN Production.ProductSubcategory PS 
ON P.ProductSubcategoryID = PS.ProductSubcategoryID 
This query only returns 295 rows.  Why is this the case?  Since an INNER JOIN is the intersection of two 
sets based on a column relationship, a row is returned only if the value in the joining column from one 
table matches at least one value of the joining column in another table.  That is, a row will only be 
returned if the ProductSubcategoryID column’s value in Production.Product appears in the 
ProductSubcategoryID column of the table Production.ProductSubcategory.  Re-run the query: 
SELECT * 
FROM Production.Product 
Once the results are returned, find the ProductSubcategoryID column and take a look at the values.  You 
will notice a NULL value in many of the rows for this column.  Since the value in those columns is NULL, 
that means that there is no value with which to join the row to in the Production.ProductSubcategory 
table.  Since the INNER JOIN is an intersection of two tables based on the columns defined and there are 
many NULL values in our joining column from one table, those rows will not be matches to any row in 
the other table (in this case, the Production.ProductSubcategory table).   
Let’s go back to our example in the previous sub-section that discussed basic database diagram.  After 
normalizing the initial Book table, we ended up with four tables: Author, Publisher, BookAuthor, and 
Book.  If you have not already done so, download the SQL file that will create those tables in your 
AdventureWorks2012 database and execute the code http://knowlton-group.com/sql-training-class-
resource/. 
Run the query below to view the contents of the Book table: 
SELECT * 
FROM Book 
Notice how the PublisherID is the last column.  Expand the Book table item in the object explorer, then 
explain the Keys item as well (you may need to refresh the database if the items are not visible after 
running the code). 
Double click on the second key.  Once the “Foreign Key Relationships” window appears, click the ellipses 
to the right of the “Tables and Columns Specification” option.  The “Tables and Columns” window should 
appear and contain the following contents: 

T-SQL Querying Guide  © The Knowlton Group, LLC  49 | P a g e    
As you can see, the PublisherID column in the Book table is referencing the primary key, PublisherID, 
from the Publisher table.  If we wanted to join the Book and Publisher tables, we would use the 
PublisherID to make that connection. 
Let’s do that, in fact.  We are going to create a SQL query that joins the Book table with the Publisher 
table through the PublisherID column in both tables.  We will then be able to return the details of the 
book along with the name of the publisher.  To do this, execute the query below: 
SELECT B.Title, B.ISBN, B.PublishDate, P.PublisherName 
FROM Book B 
INNER JOIN Publisher P 
ON B.PublisherID = P.PublisherID 
Notice the use of table aliases “B” and “P” to represent the Book and Publisher tables, respectively.  Our 
SELECT clause returns only the columns we wish to return.  These columns include the details about the 
book, contained in the Book table, and then the name of the publisher that is contained in the Publisher 
table.  We originate our query in the Book table as specified in the FROM clause.  Next, we indicate that 
we are going to be utilizing an INNER JOIN.  Then, we specified the table we wish to join to; in this case, 
the Publisher table is the table we are joining to.  Next, we type “ON” which tells SQL that we are about 
to tell it how we wish to connect our tables.  Lastly, we specify that we want to connect these tables 
when the PublisherID column from the Book table matches a PublisherID column in the Publisher table.   
When viewing the results of the query, none of the IDs appear, only the columns that we have indicated.  
For your own benefit, execute these two queries below: 
SELECT B.Title, B.ISBN, B.PublishDate, B.PublisherID, P.PublisherName 
FROM Book B 
INNER JOIN Publisher P 
ON P.PublisherID = B.PublisherID 
SELECT * 
FROM Publisher 
Notice how we include the PublisherID column from the Book table in the first query.  Manually validate 
that the first query returned the proper value for the PublisherName column.  Verify that each time a 

T-SQL Querying Guide  © The Knowlton Group, LLC  50 | P a g e    
“1” appears in the PublisherID column of Book that it matches the PublisherName column value 
associated with the row in the Publisher table with a “1” in the PublisherID column.  Do this for each 
book.  This is a very simple way to validate that the join you executed is performing as intended. 
Let’s look at another example.  In the AdventureWorks2012 database, there is table, named 
Person.Person, that contains some basic information about individuals in a database.  There is another 
table, Person.EmailAddress, that contains the email address for each person in the Person.Person table.  
Both tables use the column BusinessEntityID as the key identifier for the individual.  If we wanted to 
return the FirstName and LastName columns from Person.Person and the EmailAddress column from 
Person.EmailAddress, we would execute the query: 
SELECT P.FirstName, P.LastName, E.EmailAddress 
FROM Person.Person P 
INNER JOIN Person.EmailAddress E 
ON E.BusinessEntityID = P.BusinessEntityID 
This query is telling SQL to look at each BusinessEntityID in Person.Person and find the row with the 
same BusinessEntityID in the table Person.EmailAddress.  When that matching row is found, return the 
EmailAddress column.  Because this join is an INNER JOIN, rows are only returned if the joining key is 
present in both tables; in this case, the joining key is the BusinessEntityID column. 
Often times the data we need is spread across more than two tables.  How can we use the INNER JOIN 
(or any join for that matter) to handle retrieving data from more than two tables in the same query?  
The process is nearly identical to joining between two columns.  Suppose we wanted to take our 
previous example but add the person’s phone number.  The phone number is stored in the table 
Person.PersonPhone.  Person.PersonPhone contains the BusinessEntityID column that we can use to join 
each table together.  To add the phone number, execute the query: 
SELECT P.FirstName, P.LastName, E.EmailAddress, PP.PhoneNumber 
FROM Person.Person P 
INNER JOIN Person.EmailAddress E 
ON E.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.PersonPhone PP 
ON PP.BusinessEntityID = P.BusinessEntityID 
Since each BusinessEntityID has a single row in the Person.PersonPhone table, our row count remains 
unchanged in the results.  Notice that all we needed to do was add another INNER JOIN line, specify the 
table we wanted to join to, and then specify which columns we would be joining on.  Since there is a 
“one-to-one” relationship between each of these tables – that is, each row in a table matches up to only 
one row in another one of these tables – we can modify our join slightly.  Instead of joining between 
Person.PersonPhone and Person.Person on the BusinessEntityID column, we can join between 
Person.PersonPhone and Person.EmailAddress: 
SELECT P.FirstName, P.LastName, E.EmailAddress, PP.PhoneNumber 
FROM Person.Person P 
INNER JOIN Person.EmailAddress E 
ON E.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.PersonPhone PP 

T-SQL Querying Guide  © The Knowlton Group, LLC  51 | P a g e    
ON PP.BusinessEntityID = E.BusinessEntityID 
The concept of the INNER JOIN will become clearer with more practice.  Given that, complete the 
following lab questions to further cement your understanding of this very important aspect of SQL 
querying. 
Lab 7: INNER JOIN Practice 
1) Using the Person.Person and Person.Password tables, INNER JOIN the two tables using the 
BusinessEntityID column and return the FirstName and LastName columns from Person.Person 
and then PasswordHash column from Person.Password 
2) Join the HumanResources.Employee and the HumanResources.EmployeeDepartmentHistory 
tables together via an INNER JOIN using the BusinessEntityID column.  Return the 
BusinessEntityID, NationalIDNumber and JobTitle columns from HumanResources.Employee and 
the DepartmentID, StartDate, and EndDate columns from 
HumanResources.EmployeeDepartmentHistory.  Notice the number of rows returned.  Why is 
the row count what it is? 
3) Expand upon the query used in question 1.  Using the existing query, add another INNER JOIN to 
the Person.EmailAddress table and include the EmailAddress column in your select statement. 
4) Using the Book, BookAuthor and Author tables, join them together so that you return the Title 
and ISBN columns from Book and the AuthorName column from Author.  (Hint: You must start 
with the BookAuthor table in your FROM clause even though we will not be returning any 
columns from this table) 
5) Using the query from example 4, add another INNER JOIN that joins the Publisher table with 
your query.  Return the PublisherName column from this table.  So, you should return the Title 
and ISBN columns from Book, the AuthorName column from Author, and the PublisherName 
column from Publisher.  (Hint: this will require three separate INNER JOINs). 
LEFT OUTER JOIN and RIGHT OUTER JOIN 
What is the LEFT OUTER JOIN and RIGHT OUTER JOIN?  If you remember back to the introduction to 
INNER JOINs, it was explained that the INNER JOIN was the intersection of two sets.  The LEFT OUTER 
JOIN and RIGHT OUTER JOIN can be thought of as “everything from one set and then information from 
the intersection of both sets”.  Let’s visualize this with a Venn diagram for the LEFT OUTER JOIN: 

T-SQL Querying Guide  © The Knowlton Group, LLC  52 | P a g e    
And the Venn diagram for the RIGHT OUTER JOIN: 
One thing to note is that the LEFT OUTER JOIN and RIGHT OUTER JOIN are nearly identical – the only 
difference is which set is completed shaded.   
Looking at some examples will further clarify the LEFT OUTER JOIN and RIGHT OUTER JOIN.  If you 
remember to the last section, we executed the query: 
SELECT P.Name, P.ProductNumber, PS.Name AS ProductSubCategoryName 
FROM Production.Product P 
INNER JOIN Production.ProductSubcategory PS 
ON P.ProductSubcategoryID = PS.ProductSubcategoryID 
This query returns only 295 rows, yet the Production.Product table contains 504 rows.  We discussed 
that since this was an INNER JOIN and since some of the column values in ProductSubcategoryID were 
NULL in Production.Product, then no rows could be matched to them from 
Production.ProductSubcategory.  But suppose we wanted to return EVERY row in Production.Product 
and include the product’s subcategory name (contained in the Name column from 
Production.ProductSubcategory) if one existed.  The LEFT OUTER JOIN would solve this for us: 
SELECT P.Name, P.ProductNumber, PS.Name AS ProductSubCategoryName 
FROM Production.Product P 
LEFT OUTER JOIN Production.ProductSubcategory PS 
ON P.ProductSubcategoryID = PS.ProductSubcategoryID 

T-SQL Querying Guide  © The Knowlton Group, LLC  53 | P a g e    
Notice how there are many rows in the result set that have a NULL value in the 
ProductSubCategoryName column.  This is fine and expected; these rows have a NULL value in the 
ProductSubcategoryID column in Production.Product, so it makes sense that they wouldn’t match to 
row in Production.ProductSubcategory.  If you imagine the LEFT OUTER JOIN Venn diagram while 
viewing the results, you can see that all rows are returned from Production.Product (Table A in the 
diagram) and only the values that match as part of the intersection defined by the join from 
Production.ProductSubcategory. 
Now, execute this query: 
SELECT P.Name, P.ProductNumber, PS.Name AS ProductSubCategoryName 
FROM Production.ProductSubcategory PS 
RIGHT OUTER JOIN Production.Product P 
ON P.ProductSubcategoryID = PS.ProductSubcategoryID 
Notice how we made the Production.ProductSubcategory table in the FROM clause and shifted the 
Production.Product table to the RIGHT OUTER JOIN line.  Despite the change in code, the results are 
identical.  Why?  This is because the LEFT OUTER JOIN and RIGHT OUTER JOIN only differ in the order in 
which they are evaluated.  The RIGHT OUTER JOIN returns everything from Table B and those values for 
rows that match in Table A.  The LEFT OUTER JOIN returns everything from Table A and those values for 
rows that match in Table B.  That is the ONLY difference between the LEFT OUTER JOIN and RIGHT 
OUTER JOIN.  The LEFT OUTER JOIN evaluates the table listed first (on the “left”) and then the table 
listed second (on the “right”).   
Take this generic form query: 
SELECT A.[Column 1], A.[Column 2], B.[Column 3] 
FROM [Table A] A 
LEFT OUTER JOIN [Table B] B 
ON A.[Primary Key] = B.[Foreign Key] 
This query’s results will contain all rows from Table A and then the [Colum 3] value from Table B where 
the primary key value from Table A matches to a foreign key’s value in Table B.   
Let’s look at another example: suppose we wanted to identify the first and last name for each sales 
person associated with every single sale recorded in the Sales.SalesOrderHeader table.  However, we 
want to include all sales regardless of whether or not a sales person was listed on the sale (in this 
database, sales that were placed online do not have a sales person associated with them).  Return the 
SalesOrderNumber and TotalDue columns from Sales.SalesOrderHeader and the FirstName and 
LastName columns from Person.Person.  To complete this query, you would execute: 
SELECT SOH.SalesOrderNumber, SOH.TotalDue, P.FirstName, P.LastName 
FROM Sales.SalesOrderHeader SOH 
LEFT OUTER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = SOH.SalesPersonID 
LEFT OUTER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 

T-SQL Querying Guide  © The Knowlton Group, LLC  54 | P a g e    
Notice how we first needed to make a LEFT OUTER JOIN between Sales.SalesOrderHeader and 
Sales.SalesPerson.  Looking at the keys of Sales.SalesOrderHeader, you will find a foreign key 
relationship between the SalesPersonID column of Sales.SalesOrderHeader and the BusinessEntityID for 
Sales.SalesPerson.  Since we wanted to return all rows from the Sales.SalesOrderHeader regardless of 
whether or not a sales person was listed (and you will notice many rows in this table have a NULL value 
in the SalesPersonID column), we needed to use a LEFT OUTER JOIN to connect the two sets.  Then, we 
want to use a LEFT OUTER JOIN to join the Sales.SalesPerson and Person.Person tables.  We use the LEFT 
OUTER JOIN because not every row in Sales.SalesOrderHeader has a SalesPersonID, therefore not every 
row that is returned will have a BusinessEntityID matching.  It then follows that since there will be a 
NULL value for the BusinessEntityID column value returned by the join of Sales.SalesOrderHeader and 
Sales.SalesPerson, we must use a LEFT OUTER JOIN to connect Sales.SalesPerson and Person.Person.  If 
we used an INNER JOIN, those BusinessEntityID values from Sales.SalesPerson would have no match to 
Person.Person and thus greatly reduce our result set. 
There are scenarios where a query will require one LEFT or RIGHT OUTER JOIN and an INNER JOIN to 
complete a request.  For example, suppose we wanted to return all rows from Sales.SalesOrderHeader 
that had a sales person listed on the sale.  In other words, we only want to return those sales where the 
SalesPersonID column in Sales.SalesOrderHeader matches a BusinessEntityID column value in 
Sales.SalesPerson.  However, we also want to view the sales territory that each sales person is 
associated with regardless of whether or not they have a listed sales territory.  Since there are rows in 
the Sales.SalesPerson table with a NULL value in the TerritoryID column, we need to use a LEFT OUER 
JOIN from Sales.SalesPerson to Sales.SalesTerritory in order to complete this request.  Putting it all 
together, we can complete the aforementioned request with the following query: 
SELECT  
 P.FirstName, P.LastName, T.Name AS TerritoryName,  
 SOH.SalesOrderNumber, SOH.TotalDue 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = SOH.SalesPersonID 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 
LEFT OUTER JOIN Sales.SalesTerritory T 
ON T.TerritoryID = SP.TerritoryID 
Notice the use of the INNER JOINs between Sales.SalesOrderHeader and Sales.SalesPerson.  This ensures 
that we only returns rows for sales that had a listed sales person.  The join between Sales.SalesPerson 
and Sales.SalesTerritory is a LEFT OUTER JOIN.  Had we used an INNER JOIN here, any sale for a sales 
person that wasn’t associated with a sales territory would be excluded from our results.  This would 
have caused our result set to be less than it should be.   
What about using the other clauses while using a join?  This is fairly simple; imagine that the joins are all 
part of the FROM clause.  So, in the previous example, if we wanted to only returns those rows where 
the territory’s name was “Northeast”, we would execute: 
SELECT  
 P.FirstName, P.LastName, T.Name AS TerritoryName,  

T-SQL Querying Guide  © The Knowlton Group, LLC  55 | P a g e    
 SOH.SalesOrderNumber, SOH.TotalDue 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = SOH.SalesPersonID 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 
LEFT OUTER JOIN Sales.SalesTerritory T 
ON T.TerritoryID = SP.TerritoryID 
WHERE T.Name = 'Northeast' 
And if we wanted to sort by the TotalDue column in descending order: 
SELECT  
 P.FirstName, P.LastName, T.Name AS TerritoryName,  
 SOH.SalesOrderNumber, SOH.TotalDue 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = SOH.SalesPersonID 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 
LEFT OUTER JOIN Sales.SalesTerritory T 
ON T.TerritoryID = SP.TerritoryID 
WHERE T.Name = 'Northeast' 
ORDER BY SOH.TotalDue DESC 
The different clauses can easily be employed while using joins.  Follow the same order that the general 
form requires.  The only subtle difference is that the joins “belong” to the FROM clause.  Becoming 
familiar with the general form of a SELECT statement when using joins and all clauses is critical to your 
success in querying. 
The LEFT OUTER JOIN and RIGHT OUTER JOIN will become more easily understood with continued 
practice.  Understanding when to use a LEFT OUTER JOIN or RIGHT OUTER JOIN as opposed to an INNER 
JOIN requires complete awareness of the request and the data within the database.  Progress may be 
slow when querying a new database as you may not understand every single table’s relationships.  Be 
sure to consult the object explorer and identify the foreign keys.  Feel free to write a SELECT statement 
that returns all rows from a table so that you know how many rows to expect in your results.  Like most 
things: practice makes perfect. 
Lab 8: Including LEFT OUTER JOINs and RIGHT OUTER JOINs 
1) Return the BusinessEntityID and SalesYTD column from Sales.SalesPerson.  Join this table to the 
Sales.SalesTerritory table in such a way that every in Sales.SalesPerson will be returned 
regardless of whether or not they are assigned to a territory.  Also, return the Name column 
from Sales.SalesTerritory.  Give this column the column alias “Territory Name”. 
2) Using the previous example as your foundation, join to the Person.Person table to return the 
sales person’s first name and last name.  Now, only include those rows where the territory’s 
name is either “Northeast” or “Central” 
3) Return the Name and ListPrice columns from Production.Product.  For each product, regardless 
of whether or not it has an assigned ProductSubcategoryID, return the Name column from 

T-SQL Querying Guide  © The Knowlton Group, LLC  56 | P a g e    
Product.ProductSubcategory for each product.  Then, return the Name column from 
Production.ProductCategory for each row.  Give the Name column from Production.Product the 
alias “ProductName”, the Name column from Production.ProductSubcategory the alias 
“ProductSubcategoryName”, and the Name column from Production.ProductCategory the alias 
“ProductCategoryName”.  Order the results by the “ProductCategoryName” in descending order 
and then the “ProductSubcategoryName” in ascending order. 
FULL OUTER JOINs 
We won’t spend too much time on FULL OUTER JOINs as they are less frequently used than the other 
three join types discussed in this section.  A FULL OUTER JOIN returns rows from both tables involved in 
the join regardless of whether or not a match is identified on the joining key.  Visually, the Venn diagram 
of a FULL OUTER JOIN looks like: 
Regardless of whether a match is found for the joining key, rows will be returned.  So as to not 
overcomplicate the difficult that joins may initially present, we will not do anything further with a FULL 
OUTER JOIN.  If an example later on the text presents itself and requires the FULL OUTER JOIN, 
additional time will be dedicated to explaining its usage further.  For now, simply be aware of its 
existence and basic definition. 

T-SQL Querying Guide  © The Knowlton Group, LLC  57 | P a g e    
Section 7: Aggregate Functions 
Aggregate functions are functions that perform a calculation on a set of values and return a single result.  
Aggregate functions are necessary to understand before moving on to the GROUP BY clause within a 
SQL SELECT statement.  This section will include examples for each commonly used aggregate function 
to further your understanding and ability to apply these functions in a SELECT statement. 
The first aggregate function is the MAX() function.  This function will take a set of values as its input and 
output the largest of the values it receives.  For example, to find the largest sale listed in the 
Sales.SalesOrderHeader table we would execute: 
SELECT MAX(TotalDue) 
FROM Sales.SalesOrderHeader 
Notice how we type “MAX”, then have an opening parentheses followed by the column name we are 
attempting to find the max value for, and then end with a closing parentheses.  All functions in T-SQL 
will require opening and closing parentheses; be careful to make sure you have properly closed each 
function.  The previous query, though receiving thousands of values from all the rows that have a value 
in the TotalDue column, returns a single value – the max of the TotalDue column.  Noting the scalar 
value result with be important when we start working with the HAVING clause and with subqueries in 
later sections of this guide. 
The next aggregate function we will explore is the MIN() function.  As you might have guess, this 
function takes a set of values as its input and returns the smallest value of input as a result.  For 
example, to find the smallest sales amount from the Sales.SalesOrderHeader table, we would type: 
SELECT MIN(TotalDue) 
FROM Sales.SalesOrderHeader 
Again, the query follows the same basic pattern that we saw in the MAX() function example.  An 
important note is that NULL values are ignored.  In fact, NULL values are ignored for all aggregate 
functions except the COUNT() function.  So, you will never see NULL returned as a value for the MAX() or 
MIN() function. 
Given the discussion about NULL values in the previous paragraph, the COUNT() function will be the next 
function we look at.  This function provides a count of the number of input values it receives.  Instead of 
executing a query like: 
SELECT * 
FROM Sales.SalesOrderHeader 
Where this query will return every row and every single column – a query that takes up plenty of 
memory and is poorly optimized.  We have been doing the “SELECT *” operation to find the number of 
rows that a table contains.  With the COUNT() function, we can accomplish the same objective but with 
a simpler query: 

T-SQL Querying Guide  © The Knowlton Group, LLC  58 | P a g e    
SELECT COUNT(*) 
FROM Sales.SalesOrderHeader 
Notice the single value that is returned.  You might have also noticed how much faster that query 
completed compared to the previous example.  The asterisk in between the parentheses is telling SQL 
that we don’t want to count the number of values for a particular column, but the number of rows that 
are contained in the results. 
Alternatively, we can use the COUNT() function with a column name between the parentheses. 
SELECT COUNT(SalesPersonID) 
FROM Sales.SalesOrderHeader 
However, you must be aware of the column you choose to place between the parentheses or acutely 
aware of what you are trying to accomplish with your query.  Notice the significant decrease in the 
result of the previous query compared to the one before it.  If you remember to an earlier paragraph in 
this section, it was stated that the COUNT() function does not ignore NULL values.  That is, the result of 
the function can be directly affected by the presence of NULL values.  In the previous example, the 
COUNT(SalesPersonID) was only counting the values in the SalesPersonID column that were NOT NULL.  
However, the COUNT(*) query counted all rows regardless of whether or not everything was NULL.  This 
subtlety within the COUNT() function can significantly vary your results; be sure to understand what you 
are trying to return and handle the function accordingly. 
Another useful ability the COUNT() function provides us is the distinct count.  The distinct count requires 
a column name to be specified; the asterisk cannot be used here.  For example, if we wanted to find the 
number of distinct first names in the Person.Person table, we would execute the query: 
SELECT COUNT(DISTINCT FirstName) 
FROM Person.Person 
This operation counts the distinct number of non-NULL first name values in the table.  Suppose that a 
table exists that contains sales order details for customers.  The customer identifier can be repeated 
throughout the table because a customer could have made more than one purchase.  If we were asked 
to find the distinct number of customers that purchased our products, a COUNT(DISTINCT [Customer 
Identifier]) would be a quick way to answer this request.  This is but another useful tool to keep in mind 
as you begin to resolve requests through SQL statements. 
The next aggregate function we will look at is the AVG() function.  This function take the average of all 
the non-NULL values in a column specified in the query.  So, to find the average sale amount for each 
sale recorded in the Sales.SalesOrderHeader table we would use the query: 
SELECT AVG(TotalDue) 
FROM Sales.SalesOrderHeader 

T-SQL Querying Guide  © The Knowlton Group, LLC  59 | P a g e    
The AVG() function has no subtleties like the COUNT() function possesses.  It is, like the MAX() and MIN() 
functions, fairly straight-forward in its uses and applications. 
The SUM() function is another common aggregate function.  Like the AVG(), MIN() , and MAX() 
functions, the SUM() function takes a column as an input.  Sticking with the TotalDue column from 
Sales.SalesOrderHeader, let’s find the sum of all sales that exist in our database: 
SELECT SUM(TotalDue) 
FROM Sales.SalesOrderHeader 
A more common request might be to find the total sales for a given year.  So, let’s use the SUM() 
function to find the total sales that occurred in the year 2006: 
SELECT SUM(TotalDue) AS [2006 Sales Total] 
FROM Sales.SalesOrderHeader 
WHERE OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
We included a column alias on the SUM() function to be clear about what we were summing and what 
the returned value represents.  The WHERE clause means that the values being totaled up in the SUM() 
function are only the values from the rows where the OrderDate value was between 1/1/2006 and 
12/31/2006.  If you remember the order in which SQL evaluates a SELECT statement, the WHERE clause 
is the second clause evaluated.  The SELECT clause isn’t evaluated until after the WHERE clause has been 
evaluated.  Therefore, the SELECT statement, and specifically the aggregate function we are using in the 
previous example, will only perform an operation against the rows that have passed the WHERE clause’s 
filtering. 
The values that the SUM() function receives, just like the AVG() function, must be numeric in nature.  
Logically, it does not make sense to sum up the values in a column like FirstName.  If you do attempt to 
sum a column that it is not an appropriate data type, SQL will provide an error message similar to the 
message below: 
SELECT SUM(FirstName) 
FROM Person.Person 
Error Message: 
Msg 8117, Level 16, State 1, Line 1 
Operand data type nvarchar is invalid for sum operator. 
The error message is stating that the FirstName column has the data type “nvarchar” (in Section 11 we 
will discuss data types) which is not a data type compatible for a sum operation.  We will explore these 
concepts in more detail later in the training guide. 
There are a few more aggregate functions that you should be aware of, but we will not explore in depth.  
The majority of these aggregate functions act similarly, therefore you should be able to use a new 
aggregate function based on the information you have already accumulated.  The other aggregate 
functions can be found on Microsoft’s MSDN documentation here. 

T-SQL Querying Guide  © The Knowlton Group, LLC  60 | P a g e    
Lab 9: Aggregate Functions 
1) How many rows are in the Person.Person table?  Use an aggregate function NOT “SELECT *”. 
2) How many rows in the Person.Person table do not have a NULL value in the MiddleName 
column? 
3) What is the average StandardCost (located in Production.Product) for each product where the 
StandardCost is greater than $0.00? 
4) What is the average Freight amount for each sale (found in Sales.SalesOrderHeader) where the 
sale took place in TerritoryID 4? 
5) How expensive is the most expensive product, by ListPrice, in the table Production.Product? 
6) Join the Production.Product table and the Production.ProductInventory table for only the 
products that appear in both table.  Use the ProductID as the joining column.  
Production.ProductInventory contains the quantity of each product (several rows can appear for 
each product to indicate the product appears in multiple locations).  Your goal is to determine 
how much money we would earn if we sold every product for its list price for each product with 
a ListPrice greater than $0.  That is, if you summed the product of each product’s inventory by 
its list price, what would that value be?  (Hint: This is intentionally challenging.  You must use an 
aggregate function with a mathematical expression to accomplish your goal) 

T-SQL Querying Guide  © The Knowlton Group, LLC  61 | P a g e    
Section 8: Grouping with the GROUP BY Clause 
Grouping is a common task required to complete effective SQL queries.  A request like “show me all 
sales in the last year and who the sales person was” is less common than “show me each sales person’s 
YTD sales”.  The first request wants every sale in its own row, while the second request only wants one 
row per sales person.  To complete this type of request, the GROUP BY clause is required. 
The GROUP BY clause is placed immediately after the WHERE clause in the SELECT statement general 
form: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
WHERE [Column Name] {Comparison Operator} {Filter Criteria} 
GROUP BY [Column Name] 
ORDER BY {[Column Name], [Column Alias], [Column Ordinal]} [ASC/DESC] 
The GROUP BY clause will take every row that passes the WHERE clause filters and group the result 
based on the column that you specify.  Typically, the GROUP BY clause is used in conjunction with an 
aggregate function in the SELECT clause.  Let’s take a look at an example and explain further: 
SELECT SalesPersonID, SUM(TotalDue) AS [Total Sales] 
FROM Sales.SalesOrderHeader 
GROUP BY SalesPersonID 
This query takes all rows from the Sales.SalesOrderHeader and then groups them by the unique 
SalesPersonID values.  Then, it sums the TotalDue column for each group of SalesPersonIDs and returns 
the result accordingly.  We MUST use an aggregate function or SQL will return an error.  If we instead 
used this incorrect query: 
SELECT SalesPersonID, TotalDue 
FROM Sales.SalesOrderHeader 
GROUP BY SalesPersonID 
We would receive the error: 
Msg 8120, Level 16, State 1, Line 1 
Column 'Sales.SalesOrderHeader.TotalDue' is invalid in the select list because it is not 
contained in either an aggregate function or the GROUP BY clause. 
This error message is very clear: if you are going to complete a SELECT statement with a GROUP BY 
clause either every column in the SELECT clause needs to appear in the GROUP BY clause, or those 
columns not specified in the GROUP BY clause must be aggregate functions.  You cannot display a group 
without aggregating the values in that group.  SQL is confused in the above query because we insist that 
we want to group our results by the SalesPersonID column but then want to return an ungrouped or 
unaggregated column, TotalDue.  If you see the above error message when writing queries with the 
GROUP BY clause, immediately look to your SELECT clause and ensure that every column is in the 
GROUP BY clause or that those columns not in the GROUP BY clause and part of an aggregate function.   

T-SQL Querying Guide  © The Knowlton Group, LLC  62 | P a g e    
Let’s look at another example.  The Production.ProductInventory table contains information about our 
product’s current inventory.  Each ProductID may have multiple rows due to the design of the table.  If 
we wanted to find, for each ProductID, the total number of inventory we currently possess, we would 
execute the query: 
SELECT ProductID, SUM(Quantity) AS [Total Quantity] 
FROM Production.ProductInventory 
GROUP BY ProductID 
If we wanted to use the same query as above but add another column for how many different locations 
the product is stored (i.e. the number of rows in the table for each ProductID), we could execute: 
SELECT  
 ProductID,  
 SUM(Quantity) AS [Total Quantity],  
 COUNT(*) AS [Locations Product is Stored] 
FROM Production.ProductInventory 
GROUP BY ProductID 
We can also group by multiple columns in the same query to define, in essence, a hierarchy.  Using the 
Sales.SalesOrderHeader, we will determine total sales from 1/1/2006 to 12/31/2006 by territory and 
then by sales person.  This would look like: 
SELECT TerritoryID, SalesPersonID, SUM(TotalDue) 
FROM Sales.SalesOrderHeader 
WHERE OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY TerritoryID, SalesPersonID 
ORDER BY 1,2 
Notice that some sales took place in a specific territory with no sales person assigned to the sale, hence 
the NULL value in the SalesPersonID column for some rows.  The query has taken the rows from 
Sales.SalesOrderHeader that pass the WHERE clause filter and then groups them first by the TerritoryID 
and then breaks them down further by SalesPersonID.  Finally, after the groups have been established, 
SQL sums the TotalDue column based on these groups then sorts and outputs the results.   
Using the previous query, let’s make it cleaner.  We would rarely want to give a business user results 
that had columns like TerritoryID or SalesPersonID.  Instead, we would want the name of the territory or 
sales person.  For the next query, we will be joining to the tables that contains the names for the 
aforementioned ID columns.  We will be using INNER JOINs to simplify the results – that is, we will find 
no NULL values in the results for the territory name or sales person name column. 
SELECT  
 ST.Name AS [Territory Name],  
 P.FirstName AS [SalesPerson First Name],  
 P.LastName AS [SalesPerson Last Name],  
 SUM(TotalDue) AS [Total 2006 Sales] 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = SOH.SalesPersonID 

T-SQL Querying Guide  © The Knowlton Group, LLC  63 | P a g e    
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 
INNER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SOH.TerritoryID 
WHERE OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY ST.Name, P.FirstName, P.LastName 
ORDER BY 1,2 
Though this query looks intimidating, we are not introducing anything new or complicated.  We are 
using INNER JOINs to get the territory name (via Sales.SalesTerritory) and the sales person’s name (we 
must join twice to return this information).  Instead of grouping on the ID columns, like in the previous 
example, we are grouping on the name columns.  This has no effect on the results since each ID 
correlates directly with each territory name or sales person name (even though the sales person name is 
broken up into two columns).  If we tried to group on the ID columns instead of the name columns 
using: 
SELECT  
 ST.Name AS [Territory Name],  
 P.FirstName AS [SalesPerson First Name],  
 P.LastName AS [SalesPerson Last Name],  
 SUM(TotalDue) AS [Total 2006 Sales] 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = SOH.SalesPersonID 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 
INNER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SOH.TerritoryID 
WHERE OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY SOH.TerritoryID, SOH.SalesPersonID 
ORDER BY 1,2 
We would receive a similar error message to the error message we have been discussing throughout this 
section: 
Msg 8120, Level 16, State 1, Line 2 
Column 'Sales.SalesTerritory.Name' is invalid in the select list because it is not contained 
in either an aggregate function or the GROUP BY clause. 
SQL doesn’t necessarily know that the Name column in Sales.SalesTerritory is a direction relation to the 
TerritoryID.  So, as it evaluates each clause, it identifies an error and outputs it to the screen.  Become 
familiar with this error message; knowing it well can help reduce the time it takes you to debug each 
incorrect query. 
Let’s look another example.  Suppose we want to know how many products fall into each product 
subcategory.  This would be a fairly simple query: 
SELECT PS.Name AS [Subcategory Name], COUNT(*) AS [Products in Subcategory] 
FROM Production.ProductSubcategory PS 
INNER JOIN Production.Product P 
ON PS.ProductSubcategoryID = P.ProductSubcategoryID 
GROUP BY PS.Name 

T-SQL Querying Guide  © The Knowlton Group, LLC  64 | P a g e    
We simply start with Production.ProductSubcategory and INNER JOIN to Production.Product.  Making 
the assumption that we wish to ignore any product with a NULL ProductSubcategoryID, we could have 
placed Production.Product in the FROM clause.  Order does not matter in this instance.  We then specify 
that we wish to group by the name of the subcategory and then count the number of rows in each 
group.   
Expanding upon the previous example, let’s say now that instead of finding the number of products per 
subcategory, we wish to find the number of products per main category.  To do this, we simply add an 
INNER JOIN and modify a few clauses: 
SELECT PC.Name AS [Category Name], COUNT(*) AS [Products in Category] 
FROM Production.ProductSubcategory PS 
INNER JOIN Production.Product P 
ON PS.ProductSubcategoryID = P.ProductSubcategoryID 
INNER JOIN Production.ProductCategory PC 
ON PC.ProductCategoryID = PS.ProductCategoryID 
GROUP BY PC.Name 
If we wanted to see the number of products per subcategory but also include the category name in the 
code to improve the readability of the results, we would write: 
SELECT  
 PC.Name AS [Category Name],  
 PS.Name AS [Subcategory Name],  
 COUNT(*) AS [Products in Category] 
FROM Production.ProductSubcategory PS 
INNER JOIN Production.Product P 
ON PS.ProductSubcategoryID = P.ProductSubcategoryID 
INNER JOIN Production.ProductCategory PC 
ON PC.ProductCategoryID = PS.ProductCategoryID 
GROUP BY PC.Name, PS.Name 
ORDER BY 1,2 
In the end, SQL is a means to analysis.  Analysis typically involves grouping and stratifying results by a 
few important concepts or dimensions.  This, in fact, is one of the main premises behind data 
warehousing.  Grouping then is an integral component to writing valuable SQL code.  Be sure to 
understand how SQL is grouping the data and how to write error-free SELECT statements leveraging the 
GROUP BY clause. 
Lab 10: Grouping with the GROUP BY Clause 
1) In the Person.Person table, how many people are associated with each PersonType? 
2) Using only one query, find out how many products in Production.Product are the color “red” 
and how many are “black”. 
3) Using Sales.SalesOrderHeader, how many sales occurred in each territory between July 1, 2005 
and December 31, 2006?  Order the results by the sale count in descending order. 

T-SQL Querying Guide  © The Knowlton Group, LLC  65 | P a g e    
4) Expanding on the previous example, group the results not by the TerritoryID but by the name of 
the territory (found in the Sales.SalesTerritory table). 
5) Using the Book, BookAuthor, Author and/or Publisher tables, identify how many books each 
author either wrote or co-authored. 

T-SQL Querying Guide  © The Knowlton Group, LLC  66 | P a g e    
Section 9: Filtering Groups with HAVING Clause 
The HAVING clause is the last clause in the SQL SELECT statement.  It behaves very similarly to the 
WHERE clause, in fact.  While a WHERE clause filters out individual rows, the HAVING clause filters out 
groups.  A HAVING clause can NEVER be employed without the SELECT statement using the GROUP BY 
clause.  Since the HAVING clause filters groups out of the results, it follows that the GROUP BY clause 
must exist to define those groups in advance of them being filtered.   
Given the addition of the HAVING clause, our final SELECT statement general form now looks like: 
SELECT [Column 1], [Column 2], ... , [Column N] 
FROM [Database Name].[Schema Name].[Table Name] 
WHERE [Column Name] {Comparison Operator} {Filter Criteria} 
GROUP BY [Column Name] 
HAVING {[Aggregate Function]} {Comparison Operator} {Filtering Critera} 
ORDER BY {[Column Name], [Column Alias], [Column Ordinal]} [ASC/DESC] 
It will be helpful to have an example to look at as we describe the HAVING clause: 
SELECT ST.Name AS [Territory Name], SUM(TotalDue) AS [Total Sales - 2006] 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SOH.TerritoryID 
WHERE OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY ST.Name 
HAVING SUM(TotalDue) > 4000000 
In the previous example, we group all of the sales from 2006 by the territory the sale was associated 
with.  We then filter out each territory based on the total sales amount in 2006.  If the territory had less 
than or equal to $4 million in sales for 2006, the territory was excluded from the results.   
The HAVING clause first requires an aggregate function.  Since we are going to eliminate groups based 
on their counts, maxes, mins, etc. then it follows we must specify which aggregate function will be using 
to filter the groups by.  From there, we identify which comparison operator we will be using for this 
group filtering (see Section 4 for a review of the comparison operators).  Lastly, we specify the criteria 
we will be filtering by; in this case, $4 million is the number we have chosen to filter on.   
Suppose we wanted to return a list of only those product subcategories that contained at least 15 
products.  To complete this query we would execute: 
SELECT PS.Name AS [Subcategory Name], COUNT(*) AS [Product Count] 
FROM Production.Product P 
INNER JOIN Production.ProductSubcategory PS 
ON PS.ProductSubcategoryID = P.ProductSubcategoryID 
GROUP BY PS.Name 
HAVING COUNT(*) > 15 

T-SQL Querying Guide  © The Knowlton Group, LLC  67 | P a g e    
Or, if wanted to follow the instructions extremely explicitly and return ONLY the subcategory name, we 
could remove the “COUNT(*)” from the SELECT clause and execute: 
SELECT PS.Name AS [Subcategory Name] 
FROM Production.Product P 
INNER JOIN Production.ProductSubcategory PS 
ON PS.ProductSubcategoryID = P.ProductSubcategoryID 
GROUP BY PS.Name 
HAVING COUNT(*) > 15 
You do not need to have the aggregate function you are filtering on in the HAVING clause specified in 
the SELECT clause.   
In another example, suppose you wish to find only those departments within the company that 
currently have at least 8 employees.  Using the HumanResources.vEmployeeDepartment view, we could 
use the query: 
SELECT Department AS [Department Name], COUNT(*) AS [Employees in Department] 
FROM HumanResources.vEmployeeDepartment 
GROUP BY Department 
HAVING COUNT(*) > 8 
Like the WHERE clause, we can also use the logical operators (AND and OR) to add complexity to the 
HAVING clause.  We could modify the previous query, for example, to include all departments that had 
between 6 and 10 employees: 
SELECT Department AS [Department Name], COUNT(*) AS [Employees in Department] 
FROM HumanResources.vEmployeeDepartment 
GROUP BY Department 
HAVING COUNT(*) BETWEEN 6 AND 10 
Going back to the Sales.SalesOrderHeader table, let’s write a query that groups all sales during the year 
2006 by sales person for each sale that had a sales person assigned to it.  We want to see the total 
amount that each sales person sold and the number of sales each sales person was responsible for.  To 
complete this query without any group filtering (yet), we could type: 
SELECT  
 SOH.SalesPersonID,  
 SUM(TotalDue) AS [Total Sales ($) - 2006],  
 COUNT(*) AS [Total Sales (#) - 2006] 
FROM Sales.SalesOrderHeader SOH 
WHERE SOH.SalesPersonID IS NOT NULL 
 AND OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY SOH.SalesPersonID 
Now that we have the groupings complete, we will use the HAVING clause to only return those sales 
people who had at least $2 million in sales amount and at least 75 completed sales: 
SELECT  
 SOH.SalesPersonID,  

T-SQL Querying Guide  © The Knowlton Group, LLC  68 | P a g e    
 SUM(TotalDue) AS [Total Sales ($) - 2006],  
 COUNT(*) AS [Total Sales (#) - 2006] 
FROM Sales.SalesOrderHeader SOH 
WHERE SOH.SalesPersonID IS NOT NULL 
 AND OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY SOH.SalesPersonID 
HAVING SUM(TotalDue) > 2000000  
 AND COUNT(*) > 75 
Notice the AND logical operator separating the two filtering criteria in the HAVING clause.  The same 
process of using logical operators in the WHERE clause can be applied to the HAVING clause.  
If you fully comprehended the WHERE clause, then the HAVING clause should not give you much 
trouble.  The key difference to note is that the WHERE clause filters out rows based on column values, 
while the HAVING clause filters out groups based on aggregate functions.  If you can understand this 
subtle difference, the HAVING clause will come easy and should give you little trouble. 
Lab 11: Filtering Groups with the HAVING Clause 
1) Find the total sales by territory for all rows in the Sales.SalesOrderHeader table.  Return only 
those territories that have exceeded $10 million in historical sales.  Return the total sales and 
the TerritoryID column. 
2) Using the query from the previous question, join to the Sales.SalesTerritory table and replace 
the TerritoryID column with the territory’s name. 
3) Using the Production.Product table, find how many products are associated with each color.  
Ignore all rows where the color has a NULL value.  Once grouped, return to the results only 
those colors that had at least 20 products with that color. 
4) Starting with the Sales.SalesOrderHeader table, join to the Sales.SalesOrderDetail table.  This 
table contains the line item details associated with each sale.  From Sales.SalesOrderDetail, join 
to the Production.Product table.  Return the Name column from Production.Product and assign 
it the column alias “Product Name”.  For each product, find out how many of each product was 
ordered for all orders that occurred in 2006.  Only output those products where at least 200 
were ordered. 
5) Find the first and last name of each customer who has placed at least 6 orders between July 1, 
2005 and December 31, 2006.  Order your results by the number of orders placed in descending 
order.  (Hint: You will need to join to three tables – Sales.SalesOrderHeader, Sales.Customer, 
and Person.Person.  You will use every clause to complete this query). 

T-SQL Querying Guide  © The Knowlton Group, LLC  69 | P a g e    
Section 10: Built-In SQL Server Functions 
SQL Server has many built-in functions that can improve the efficiency of your code and reduce 
unnecessary development.  An entirely book could be dedicated to the vast number of functions 
available to us when developing queries.  In this guide, however, we will focus on some of the most 
commonly used built-in function.  That is not to say that every function you may ever want to use is 
contained in this guide; merely that this guide will walk you through the most common functions and 
give you the skills to be able to then immediately use any new function you learn of while completing 
SELECT statements.  For the complete built-in function documentation, navigate to 
http://msdn.microsoft.com/en-us/library/ms174318(v=sql.105).aspx. 
String Built-In Functions 
Due to the sheer number of functions we will be discussing, this section will contain the name and 
description of the function, its basic syntax, and then at least one example of the function in use.   
Function: LEFT()  
Syntax: LEFT(string, integer_value) 
Description: Returns the specified number of left most characters, from the integer_value argument, 
from the specified string. 
Example 1: The first five characters from each LastName from Person.Person 
SELECT LEFT(LastName, 5) 
FROM Person.Person 
Example 2: The first five characters for each FirstName from Person.Person with a column alias: 
SELECT LEFT(FirstName, 5) AS [First Five FirstName Characters] 
FROM Person.Person 
Function: RIGHT() 
Syntax: RIGHT(string, integer_value) 
Description: Returns the specified number of right most characters, from the integer_value argument, 
from the specified string. 
Example 1: The last five characters from each product name in Production.Product 
SELECT RIGHT(Name, 5) AS [Last Five Characters of Product Name] 
FROM Production.Product 
Function: SUBSTRING() 
Syntax: SUBSTRING(string_value, starting_integer, substring_length) 
Description: Takes a substring of a base string starting at the integer index specified by the 
starting_integer argument and is substring_length characters in length. 
Example 1: From the string “T-SQL Training Guide”, find the string that is between the third and fifth 
characters (inclusive): 
SELECT SUBSTRING('T-SQL Training Guide', 3, 3) 

T-SQL Querying Guide  © The Knowlton Group, LLC  70 | P a g e    
For those with a programming background, it is important to note that T-SQL treats the first index of the 
string “T-SQL Training Guide” as the value 1 NOT 0.  In most object-oriented programming languages or 
even general scripting languages, the first character of a string or array typically is the index value 0.  In 
T-SQL, the initial index value is 1.  Be aware of this when developing your queries. 
Example 2: Find, from each FirstName from Person.Person, the third through seventh characters in each 
name.  Also display the unmodified FirstName in the query. 
SELECT FirstName, SUBSTRING(FirstName, 3, 5) 
FROM Person.Person 
Notice how the length of the FirstName affects the resulting substring.   
Function: CHARINDEX() 
Syntax: CHARINDEX(searching_for_string, string [, starting_index]) 
Description: The CHARINDEX() function looks through the string argument to find the starting index of 
the searching_for_string argument somewhere in the string.  The starting_index optional argument 
allows you to tell the function you wish to start looking for the searching_for_string argument at the 
specified index location in string. 
Example 1: Find the first instance of the letter “Q” in the string “T-SQL Training Guide” 
SELECT CHARINDEX('Q', 'T-SQL Training Guide') 
Example 2: In each FirstName of Person.Person, find the location of the first “m” in each name: 
SELECT CHARINDEX('m', FirstName, 0), FirstName 
FROM Person.Person 
By itself this function has minimal usage, however when we discuss nesting functions at the end of this 
section, it will be incredibly helpful when combined with other string functions. 
Function: LTRIM() 
Syntax: LTRIM(string or column) 
Description: Returns a string with the leading spaces removed. 
Example 1:  Remove the leading spaces from the string “   This is a test” 
SELECT LTRIM('    This is a test') 
The RTRIM() function performs nearly an identical task as the LTRIM() function except the RTRIM() 
function removes the trailing spaces (at the end of the string) instead of the leading spaces like the 
LTRIM() function handles. 
Function: REPLACE() 
Syntax: REPLACE(string_expression, string_pattern, string_replacement) 

T-SQL Querying Guide  © The Knowlton Group, LLC  71 | P a g e    
Description: The REPLACE() function searches through the string_expression argument for the 
string_pattern string.  For each instance that the string_pattern is found in string_expression, the 
string_pattern characters are replaced with the string_replacement value. 
Example 1: Replace each “T” from the string “T-SQL Training Guide” with the letter “X” 
SELECT REPLACE('T-SQL Training Guide', 'T', 'X') 
Example 2: Replace every “a” from each FirstName in Person.Person with the phrase “REDACTED” 
SELECT FirstName, REPLACE(FirstName, 'a', 'REDACTED') 
FROM Person.Person 
Function: LOWER() 
Syntax: LOWER(string or column) 
Description: Converts every character in the string or column to lowercase letters. 
Example 1: Make every character in the FirstName column of Person.Person lowercase: 
SELECT FirstName, LOWER(FirstName) 
FROM Person.Person 
Converting strings you are comparing to either all uppercase or all lowercase is a helpful trick to avoiding 
the case-sensitivity issues presented. 
Function: UPPER() 
Syntax: UPPER(string or column) 
Description: Converts every character in the string or column to uppercase letters. 
Example 1: Make every character in the LastName column of Person.Person uppercase: 
SELECT LastName, UPPER(LastName) 
FROM Person.Person 
Function: LEN() 
Syntax: LEN(string or column) 
Description: Returns the length (number of characters) of the string or column expression 
Example 1: How many characters long is the string “T-SQL Training Guide” 
SELECT LEN('T-SQL Training Guide') 
Example 2: How many characters is each FirstName in the Person.Person table? 
SELECT FirstName, LEN(FirstName) 
FROM Person.Person 
Earlier in this section, we briefly mentioned the concept of nesting functions.  This is where the result of 
a function is used as in the input argument for another function – hence the term “nesting”.  For 
example, if we wanted to trim the leading spaces from the string “     T-SQL Training Guide” and then, 
once it has been trimmed, return the length of the resulting string, we would execute: 

T-SQL Querying Guide  © The Knowlton Group, LLC  72 | P a g e    
SELECT LEN(LTRIM('     T-SQL Training Guide')) 
If we break this down step-by-step, we can see how the result evolves.  Like the order of operations, 
function evaluation works “inside out”.  That is, the inner most function is evaluated first, then the 
second inner most function, and so on until the outer function is finally evaluated.  In the previous 
example, the first function to be evaluated is “LTRIM(‘     T-SQL Training Guide’)”.  Once that result has 
been determined, then the LEN() function is applied to the LTRIM() result.  We can show the progression 
of this nested function in the query below: 
SELECT  
 '     T-SQL Training Guide' AS OriginalExpression,  
 LTRIM('     T-SQL Training Guide') AS TrimmedExpression, 
 LEN(LTRIM('     T-SQL Training Guide')) AS LengthOfTrimmedExpression 
By viewing the results, you can see the step-by-step process unfold of the nested function evaluation. 
Let’s look at another, larger nested function.  Suppose we wanted to take the substring of “This is a 
sample expression we are going to manipulate.” starting at the seventh character and ending with the 
twenty-second character (including the twenty-second character).  From there, we will make each 
character uppercase.  After that, we will replace each ‘E’ with the letter ‘x’.  So, in all, we are going to 
utilize three functions in a single expression. 
SELECT  
 REPLACE(UPPER(SUBSTRING('This is a sample expression we are going to manipulate.', 
7, 16)), 'E', 'x') 
The result is a pretty lengthy expression.  Let’s do as we did with the previous example and break this 
nested function down into its individual steps: 
SELECT  
 'This is a sample expression we are going to manipulate.' AS OriginalString,  
 (SUBSTRING('This is a sample expression we are going to manipulate.', 7, 16)) AS 
SubstringValue, 
 UPPER(SUBSTRING('This is a sample expression we are going to manipulate.', 7, 16)) 
AS UpperSubstring, 
 REPLACE(UPPER(SUBSTRING('This is a sample expression we are going to manipulate.', 
7, 16)), 'E', 'x') AS FinalString 
By reading the results left-to-right, you can see how the nested function evaluation process is taking 
place.  The substring is first identified, then the substring is capitalized, and lastly each “E” in the 
substring is replaced with “x”.   
Understanding how nested functions are evaluated is not altogether complicated, rather it is being able 
to unravel what the inner-most function is and then working outward from there.  Whenever you are 
compiling a larger nested function, it may be beneficial to type out each individual step – like in the last 
example query – to understand each level of the nested function.  Below are several practice problems 

T-SQL Querying Guide  © The Knowlton Group, LLC  73 | P a g e    
for you to complete to reinforce your understanding of string functions and the basics of nested 
functions. 
Lab 12: String Functions and Nested Functions 
1) Return the first eight characters of the string “This is a basic string”. 
2) Return the last six characters of the string “This is another string” 
3) Find the index (integer location) of the first instance of the letter “e” in each product name from 
the Production.Product table. 
4) Find the substring of the territory name from Sales.SalesTerritory starting at the third character 
and lasting four characters in length. 
5) Starting with the string “This is a slightly longer string”, find the last eight characters and then, 
from that result, find the first four characters.  In other words, find the first four characters of 
the last eight characters from the string “This is a slightly longer string”. (Hint: use a nested 
function for this query) 
6) Find the string that results from finding all characters to left of and including the first “e” in each 
FirstName from Person.Person.  Only return those results where the resulting string is not blank. 
(Hint: this will be challenging.  Use a nested function in the SELECT statement and then that 
same nested function in the WHERE clause) 
Date and Time Built-In Functions 
Like the set of built-in string functions, there are many date and time functions.  However, 
understanding the date and time functions can be incredibly useful.  Especially in later levels of SQL 
training, using parameters based on date functions can drastically improve the quality of your code. 
Below are some of the most commonly used date and time functions and at least one example of each. 
Function: GETDATE() 
Syntax: GETDATE() 
Description: Returns the current date and time of the computer on which SQL Server is running. The 
date and time is returned with the form: YYYY-MM-DD HH:MM:SS.nnn. The GETDATE() function is 
accurate to within .00333 seconds. 
Example:  
SELECT GETDATE() 
Function: SYSDATETIME() 
Syntax: SYSDATETIME() 
Description: Returns the current date and time of the computer on which SQL Server is running.  This is 
a slightly higher precision function than GETDATE().  The form returned is YYYY-MM-DD 
HH:MM:SS.nnnnnnn.  The SYSDATETIME() function is accurate to 100 nanoseconds. 
SELECT SYSDATETIME() 

T-SQL Querying Guide  © The Knowlton Group, LLC  74 | P a g e    
There are several other date and time functions that return slightly different variations of the current 
time.  These include time zone offsets and UTC date and times.  The two most commonly used at this 
stage in SQL development will be either the GETDATE() or the SYSDATETIME() function.  If you 
information on the other functions, navigate to http://msdn.microsoft.com/en-
us/library/ms186724(v=sql.105).aspx. 
Function: DATEDIFF() 
Syntax: DATEDIFF(datepart, start_date, end_date) 
Description: The DATEDIFF() function with find the datepart time gap between the start_date and 
end_date values.  There are many dateparts that can be used.  These dateparts are captured in the table 
below: 
Datepart 
Abbreviation 
year 
yy, yyyy 
quarter 
qq, q 
month 
mm, m 
dayofyear 
dy, y 
day 
dd, d 
week 
wk, ww 
weekday 
dw, w 
hour 
hh 
minute 
mi, n 
second 
ss, s 
millisecond 
ms 
microsecond 
mcs 
nanosecond 
ns 
Example 1: Find the number of years between July 7, 1927 and May 14, 1929. 
SELECT DATEDIFF(YEAR, '7/1/1927', '5/14/1929') 
Example 2: Find the number of months between July 18, 1990 and May 14, 1991. 
SELECT DATEDIFF(mm, '7/18/1990', '5/14/1991') 
Example 3: Find the number of hours between 5:14 PM on June 3, 2008 and 12:15 PM on June 5, 2008. 
SELECT DATEDIFF(HOUR, '6/3/2008 17:14:00', '6/5/2008 12:15:00') 
Function: DATEADD() 
Syntax: DATEADD(datepart, number, date_value) 
Description: Add to the date_value date or time the number argument value in the specified datepart 
intervals. 

T-SQL Querying Guide  © The Knowlton Group, LLC  75 | P a g e    
Example 1: What was the date thirty days after April 7, 2011? 
SELECT DATEADD(DAY, 30, '4/7/2011') 
Example 2: What date will it be thirty days from today? 
SELECT DATEADD(DAY, 30, GETDATE()) 
Example 3: What date was it three weeks before January 26, 2013? 
SELECT DATEADD(WK, -3, '1/26/13') 
The DATEADD() function will accept a negative value for the number argument.  This allows you to 
subtract a given number of dateparts from the time specified in the last argument.   
Function: YEAR() 
Syntax: YEAR(date) 
Description: Returns the year of the date argument. 
Example 1: What is the year value for the current date? 
SELECT YEAR(SYSDATETIME()) 
Example 2: What year was it 8,456 days ago? 
SELECT YEAR(DATEADD(DAY, -8456, GETDATE())) 
In the previous nested function example, we first found what the date was 8,456 days ago.  Then, given 
the returned date value, we extracted the year datepart from the result. 
Function: MONTH() 
Syntax: MONTH(date) 
Description: Returns the month value of the given date argument. 
Example: What is the month value for November 11, 1945? 
SELECT MONTH('11/11/1945') 
Like MONTH() and YEAR(), there also exists a DAY() function which returns the day datepart from the 
date value.   
There are not as many date and time functions as there are string functions.  However, becoming 
familiar with date and time functions is a necessary skill in becoming a proficient developer of SQL 
queries.  Below are some practice problems that will help you gain experience using date and time 
functions. 

T-SQL Querying Guide  © The Knowlton Group, LLC  76 | P a g e    
Lab 13: Date and Time Built In Functions  
1) What is the month datepart for June 12, 2011? 
2) What is the year datepart for November 20, 1992? 
3) What day was it seventy-four days ago? 
4) What is the current date? Complete this problem using at least two different date functions. 
5) How many days are between April 17, 1996 and September 4, 2001? 
6) How many months are between December 25, 1993 and the date that is 2,719 days before 
today’s date? 
NULL Handling Functions 
Though the majority of the functions will explore lie under the concept of “Date and Time” or “String” 
functions, there are two other important functions we will look at: COALESCE() and NULLIF().  Both of 
these functions are used to handle NULL values in your result set.  The following brief section will show 
examples of both functions being employed. 
Function: COALESCE() 
Syntax: COALESCE(expression_1, expression_2, …, expression_n) 
Description: The COALESCE() function receives expressions as inputs and, based on the order of these 
inputs, returns the first non-NULL value.  These expression arguments can be columns, functions, 
mathematical expressions, strings or any other type of SQL value. 
Example 1: Retrieve the FirstName, MiddleName, and LastName columns from Person.Person.  
However, anytime a NULL is present in the MiddleName column, replace it with a blank space. 
SELECT FirstName, COALESCE(MiddleName, '') AS MiddleName, LastName 
FROM Person.Person 
If you look at the results of the previous query, you will notice that the MiddleName no longer contains 
any NULL values in the output.  Instead, you find many blank spaces.  This indicates to us that the 
COALESCE() function worked properly.  To reiterate again, blank spaces are not equal to NULL values! 
Example 2: Return the product name and the product color from Production.Product.  If the product 
does not have a value in the Color column, return the string “No Color Listed”. 
SELECT Name AS ProductName, COALESCE(Color, 'No Color Listed') AS Color 
FROM Production.Product 
In place of the NULL value, the results now show the string “No Color Listed”.  If you were exporting the 
results of the query to an Excel file and providing the data to business users, replacing NULL values with 
either blanks or some other meaningful text is often more valuable and more easily understood by the 
recipient. 

T-SQL Querying Guide  © The Knowlton Group, LLC  77 | P a g e    
Example 3: Using the Person.Person table, return the MiddleName column.  If the MiddleName column 
is NULL, then return the Title column.  If the Title column is NULL then return the Suffix column.  If the 
Suffix column is NULL return the FirstName column. 
SELECT COALESCE(MiddleName, Title, Suffix, FirstName) 
FROM Person.Person 
This is an unrealistic and trivial example, but it shows the capability of multiple expressions being 
evaluated in the COALESCE() function.  Since the COALESCE() function returns the first non-NULL value 
of inputs, the function can evaluate multiple inputs at once.  In the above example, the function looks to 
see if the MiddelName value is NULL.  If it is non-NULL then the MiddleName value is returned.  If the 
MiddleName value is NULL, then the function looks at the Title value.  If the Title value is non-NULL then 
that value is returned, else the function looks at whether or not the Suffix column value is NULL.  This 
iterative process completes until the function has exhausted all possible arguments. 
Function: NULLIF() 
Syntax: NULLIF(expression_1, expression_2) 
Description: Returns a NULL value if the two expressions are equal. 
Example 1: Return a NULL value if the BillToAddressID and ShipToAddressID in the table 
Sales.SalesOrderHeader are equal. 
SELECT NULLIF(BillToAddressID, ShipToAddressID) 
FROM Sales.SalesOrderHeader 
The NULLIF() function can, at times, provide a simpler solution to comparing value in a SELECT statement 
without requiring CASE statements – a subject we will talk about in later sections. 
Lab 14: NULL Handling Functions 
1) If the Title column of Person.Person is NULL then return the string “No Title Listed”. 
2) If the MiddleName column of Person.Person is NULL then return the string “No Middle Name 
Listed”. 
3) If the MiddleName column is NULL then return the FirstName and LastName concatenated.  If 
the MiddleName is non-NULL then return the FirstName, MiddleName and LastName 
concatenated. 
4) Using the Production.Product table, if the MakeFlag and FinishedGoodsFlag columns are equal 
then return a NULL value. 

T-SQL Querying Guide  © The Knowlton Group, LLC  78 | P a g e    
Section 11: SQL Server Data Types & Type Casting 
Every column, variable, expression or parameter in SQL Server has an associated data type.  This data 
type represents the type of data that the value can store: integer data, money data, character data, etc.  
We have found in the aggregate function section that certain functions only apply to values with certain 
data types.  For example, we cannot sum the values of a FirstName column.  Logically, this makes sense.  
Sometimes, however, we need to modify the data type of a column or value in order to return the value 
in a form that we want.  For example, the GETDATE() function returns the date in the datetime data type 
format.  This data type contains both date and time values.  Suppose we only wanted to return the date 
part of the result and not the timestamp component.  To complete this request, we can use one of two 
functions to perform a task known as type casting.  This is where you alter the data type associated with 
an expression or value.  
SELECT CAST(GETDATE() AS DATE) 
SELECT CONVERT(DATE, GETDATE()) 
The CONVERT() and CAST() functions complete the same task though CAST() is the more ANSI-standard 
version of the casting functions.  If you look at the results, you will notice that the timestamp is removed 
from the result of GETDATE().  This is because the GETDATE() result, returned as a datetime data type, 
has been casted as a date data type.  This subtle change in data type removes the timestamp since the 
timestamp is not a component of the date data type.   
You may not be able to convert one data type to another data type.  For example, trying to convert a 
column containing non-Unicode strings, the VARCHAR data type, to a DECIMAL data type may not be 
possible if the VARCHAR column contains values such as word, phrases or letters.  If the VARCHAR 
column contains all numeric values (despite the fact that the column is defined as a VARCHAR data 
type), then the column could be cast to a numeric data type.  If you tried, for example, to convert the 
FirstName column of Person.Person to an integer, INT, data type, you would receive the following error: 
SELECT CAST(FirstName AS INT) 
FROM Person.Person 
Error Message: 
Msg 245, Level 16, State 1, Line 1 
Conversion failed when converting the nvarchar value 'Syed' to data type int. 
The database engine is going through each value in the column and trying to convert the value to the 
INT data type.  It runs into the NVARCHAR value “Syed” – an individual’s name – that clearly cannot be 
converted to an integer.  This - or a similar - error message is common when you attempt to type cast 
from one data type to another data type that SQL will not allow. 
One common area where type casting may be required is during the joining process.  You may be trying 
to join from an NVARCHAR column to a VARCHAR column.  It is possible that you will need to cast one of 
those data types to the matching data type in order to resolve the join.  Observe the error message for 

T-SQL Querying Guide  © The Knowlton Group, LLC  79 | P a g e    
indicating of type mismatches and review your code for potential locations that require type casting 
within the SELECT statement. 
Below is a table containing the common data types in SQL Server 2008R2.  For more information on data 
types and type casting, navigate to the following links: http://msdn.microsoft.com/en-
us/library/ms187928(v=sql.105).aspx and http://msdn.microsoft.com/en-
us/library/ms187752(v=sql.105).aspx. 
Data Type 
Description 
bigint 
An 8 byte integer ranging from -9,223,372,036,854,775,808 to 
9,223,372,036,854,775,807 
numeric 
A numeric data type with fixed precision and scale 
bit 
An integer that can be either NULL, 0, or 1. 
smallint 
A 2 byte integer ranging from  
decimal 
A numeric data type with fixed precision and scale (functionally equivalent to 
numeric) 
smallmoney 
A 4 byte representation of currency ranging from - 214,748.3648 to 
214,748.3647 
int 
A 4 byte integer ranging from -2,147,483,648 to 2,147,483,647 
tinyint 
A 1 byte integer ranging from 0 to 255 
money 
An 8 byte representation of currency ranging from -922,337,203,685,477.5808 
to 922,337,203,685,477.5807 
float 
An approximate numeric data type that falls within the ranges: - 1.79E+308 to -
2.23E-308, 0 and 2.23E-308 to 1.79E+308 
real 
An approximate numeric data type that falls within the ranges: - 3.40E + 38 to -
1.18E - 38, 0 and 1.18E - 38 to 3.40E + 38.  The real data type is equivalent to 
float(24). 
date 
A data type storing date values.  They can be stored and viewed in a variety of 
formats.  The date data type has the range: 0001-01-01 through 9999-12-31 
datetimeoffset 
Combines the date and time into a time-zone aware value. 
datetime2 
A data type storing a date and time component with more precision than the 
datetime data type. 
smalldatetime 
A smaller, subset data type of datetime.  It allows values ranging from 1900-01-
01 through 2079-06-06. 
datetime 
Defines a date and time with fractional seconds.   
time 
Defines the time of data without time zone awareness. 
char 
A string of a defined fixed length (i.e. char(48) = 48 character, 48 bytes) 
varchar 
A string that cannot exceed the defined length yet whose storage space is 
based on the length of the value + 2 bytes.  (i.e. VARCHAR(20) means that no 
value in the column can exceed 20 characters.  Unless CHAR(20), if a value in 
the common is six characters long then the storage for the VARCHAR column 
will only use 8 bytes for that value.) 
text 
Variable length, non-Unicode string that cannot exceed 2,147,483,647 
characters. 

T-SQL Querying Guide  © The Knowlton Group, LLC  80 | P a g e    
nchar 
Fixed length Unicode string data types between 1 and 4,000 characters.  
Storage space is two times the length of the defined data type (i.e. nchar(4) 
stores 8 bytes per value). 
nvarchar 
Variable length, Unicode string data.  Similar to VARCHAR but with Unicode 
values. 
ntext 
Variable length, Unicode string data that cannot exceed 1,073,741,823 
characters. 
Lab 15: SQL Server Data Types & Type Casting 
1) Write the SQL SELECT statement that returns the FirstName column of Person.Person casted as 
the VARCHAR data type. 
2) Write three expressions in a single SELECT statement: one that returns the results of 11 divided 
by 4.  The second column should return the result of 11 casted as float divided by 4 casted as 
float.  The final column should divide 11.0 by 4.0 (including the decimal point and trailing zero). 
3) Cast the FirstName column of Person.Person as the VARCHAR(3) data type.  What happens?  
Why? 
4) Many of the values in the Size column of the Production.Product table contain numeric values.  
Write a SELECT statement that returns the Size column casted as the integer data type.  What is 
the result?  Why? 
5) Using the same SELECT statement that you developed in problem 4, add the WHERE clause, 
“WHERE ISNUMERIC(Size) = 1”.  What is the result of the query now? Why? (Hint: use the MSDN 
articles to find how the ISNUMERIC() function is used). 

T-SQL Querying Guide  © The Knowlton Group, LLC  81 | P a g e    
Section 12: Table Expressions 
A table expression can be thought of as a virtual table defined by a query.  The scope of the results of a 
table expression are confined to a single SELECT statement.  You can think of a table expression as an 
imaginary table that you create with columns and values that can be used in a SELECT statement.  We 
will discuss two types of table expressions: the derived table and the common table expression (CTE). 
Derived Tables 
A derived table is a table expression that appears in the FROM clause of a SELECT statement.  Instead of 
specifying a table or view name in the FROM clause, like we have in past examples, you will define a 
simple query whose results will take the place of a table or view name.  Let’s look at an example: 
SELECT * 
FROM ( 
 SELECT BusinessEntityID, FirstName, LastName 
 FROM Person.Person 
 ) AS PersonName 
Notice how we are not placing a table name or view name after the FROM clause.  Instead, we define a 
derived table using a basic query – in this case, our derived table contains the BusinessEntityID, 
FirstName, and LastName columns from Person.Person.  We must always give a derived table a table 
alias; in this case, we give the derived table the name PersonName.  So you are aware of the type of 
error message you will receive if no table alias is given, execute the query below and see the error 
message that results: 
SELECT * 
FROM ( 
    SELECT BusinessEntityID, FirstName, LastName 
    FROM Person.Person 
 ) 
Error Message: 
Msg 102, Level 15, State 1, Line 5 
Incorrect syntax near ')'. 
Unfortunately, the error message is vague and, aside from giving us the location of the error, is relatively 
unhelpful in specifically identifying the problem.   
Derived tables have three other rules that must be met: all columns must have names, all columns must 
have unique names, and the ORDER BY clause is not allowed unless TOP is included in the SELECT clause.  
The first two should intuitively make sense: a table is not allowed to have a column without a name or 
you would have no way of querying that nameless column.  A table also cannot have multiple columns 
with the same name.  The database engine wouldn’t be able to tell which of the identically named 
columns you wish to return.  This, in a way, presents itself when joining tables.  When two tables have 

T-SQL Querying Guide  © The Knowlton Group, LLC  82 | P a g e    
the same column name, we must specify the table alias in front of the column name else we receive 
error messages indicating column ambiguity. 
There are several reasons as to why we may use a derived table.  Remember back to when we discussed 
that you could not use a column alias in the WHERE clause of your SELECT statement due to the order in 
which SQL clauses are processed by the database.  A derived table can help reduce the amount of typing 
required to filter on functions that employ column aliases.  For example, suppose I want to return how 
many orders were placed each year from the Sales.SalesOrderHeader table.  However, we only want to 
return this information for the year 2006.  We could avoid a derived table and complete the query this 
way: 
SELECT YEAR(OrderDate) AS OrderYear, COUNT(*) AS SalesCount 
FROM Sales.SalesOrderHeader 
WHERE YEAR(OrderDate) = '2006' 
GROUP BY YEAR(OrderDate) 
In the query above we are focused to constantly type out “YEAR(OrderDate)” in both the WHERE and 
GROUP BY clause.  However, with a derived table we could simplify the query slightly: 
SELECT OrderYear, COUNT(*) AS SalesCount 
FROM ( 
 SELECT YEAR(OrderDate) AS OrderYear, SalesOrderID 
 FROM Sales.SalesOrderHeader 
 ) SalesDetails 
WHERE OrderYear = '2006' 
GROUP BY OrderYear 
Now the “YEAR(OrderDate)” is only specified once as opposed to three times in the first query.  This may 
seem trivial with a function as simple as “YEAR(OrderDate)”, but, with larger nested functions, simplicity 
and reduced redundancy in your code can save you quite a bit of time and actually improve the query’s 
performance.   
Some functions, like the ranking functions we will explore in section 15, are not allowed to be used in 
the WHERE clause.  In these instances, the use of a table expression is necessary if you which to filter 
based on the values of the ranking function.   
Let’s look at a few more examples of derived tables to build up our comfort.  Using a derived table 
return the BusinessEntityID, NationalIDNumber, year of the birth date, and year of the hire date for all 
rows where the year of the birth date is less than 1960.  We can complete this query using: 
SELECT * 
FROM ( 
 SELECT BusinessEntityID, NationalIDNumber, YEAR(BirthDate) AS BirthYear, 
YEAR(HireDate) AS HiredYear 
 FROM HumanResources.Employee 
 ) HR_Emp 
WHERE BirthYear < 1960 

T-SQL Querying Guide  © The Knowlton Group, LLC  83 | P a g e    
As you can see, we are applying a derived table with the YEAR() function for both the BirthDate and 
HireDate columns.  A column alias is applied to each column with the YEAR() function in the derived 
table, and the derived table itself is given the table alias HR_Emp.  We can now use the BirthYear alias 
from the derived table in the WHERE clause and restrict our result set based on the request. 
Using the same techniques, let’s return only those employees who were hired during or after the year 
2004. 
SELECT * 
FROM ( 
 SELECT BusinessEntityID, NationalIDNumber, YEAR(BirthDate) AS BirthYear, 
YEAR(HireDate) AS HiredYear 
 FROM HumanResources.Employee 
 ) HR_Emp 
WHERE HiredYear >= 2004 
Derived tables, like functions, can be nested within each other.  Let’s look at the following example: 
SELECT * 
FROM ( 
 SELECT BusinessEntityID, NationalIDNumber, BirthYear, YEAR(HireDate) AS HiredYear 
 FROM ( 
    SELECT BusinessEntityID, NationalIDNumber, YEAR(BirthDate) AS BirthYear, HireDate 
    FROM HumanResources.Employee 
    ) AS HR_Inner_Nested 
 ) AS HR_Outer 
Now, we have used a derived table in the FROM clause of another derived table.  This is perfectly 
acceptable and is often required for more complex queries.  Notice how in the outer derived table we 
are able to use the BirthYear column alias from the inner derived table.   
We could add WHERE clauses to the derived tables to create a more robust and involved query: 
SELECT * 
FROM ( 
 SELECT BusinessEntityID, NationalIDNumber, BirthYear, YEAR(HireDate) AS HiredYear 
 FROM ( 
    SELECT BusinessEntityID, NationalIDNumber, YEAR(BirthDate) AS BirthYear, HireDate 
    FROM HumanResources.Employee 
    ) AS HR_Inner_Nested 
 WHERE BirthYear < 1960 
 ) AS HR_Outer 
WHERE HiredYear > 2003 
Now, we have employed a WHERE clause twice without having to include the YEAR() function.  This 
simplicity will be incredibly helpful as we add more complexity to our queries in later sections. 
Just as we could use a derived table in the FROM clause, we can also use derived tables in joins.  Say we 
wanted to view, for each year we had sales, the total amount of sales revenue generated and the 
number of employees hired for each year.  This is a relatively complicated query, but it can be made 
easier with the use of derived tables: 

T-SQL Querying Guide  © The Knowlton Group, LLC  84 | P a g e    
SELECT Sales_by_Year.SalesYear, Sales_by_Year.TotalRevenue, Hires_by_Year.NewHireCount 
FROM ( 
 SELECT SalesYear, SUM(TotalDue) AS TotalRevenue 
 FROM ( 
    SELECT YEAR(OrderDate) AS SalesYear, TotalDue 
    FROM Sales.SalesOrderHeader 
    ) SalesDetails 
 GROUP BY SalesYear 
 ) Sales_by_Year 
LEFT OUTER JOIN ( 
 SELECT HireYear, COUNT(BusinessEntityID) AS NewHireCount 
 FROM ( 
    SELECT YEAR(HireDate) AS HireYear, BusinessEntityID 
    FROM HumanResources.Employee 
    ) AS EmployeeDetails 
 GROUP BY HireYear 
 ) AS Hires_by_Year 
ON Hires_by_Year.HireYear = Sales_by_Year.SalesYear 
ORDER BY 1 
Both the FROM clause and the LEFT OUTER JOIN use nested derived tables.  This is to emphasize that we 
do not need to specify the YEAR() function in each WHERE clause.  Each outer derived table is grouped 
by the year and then the two sets of outer derived tables are joined by the year.  Since we wanted the 
sales and hire information for only the years in which we had sales, we start with the sales information 
and then LEFT OUTER JOIN to the hiring information by year.   
This query could also be written without a nested derived table in the FROM clause and the LEFT OUTER 
JOIN: 
SELECT Sales_by_Year.SalesYear, Sales_by_Year.TotalRevenue, Hires_by_Year.NewHireCount 
FROM ( 
 SELECT YEAR(OrderDate) AS SalesYear, SUM(TotalDue) AS TotalRevenue 
 FROM Sales.SalesOrderHeader 
 GROUP BY YEAR(OrderDate) 
 ) Sales_by_Year 
LEFT OUTER JOIN ( 
 SELECT YEAR(HireDate) AS HireYear, COUNT(BusinessEntityID) AS NewHireCount 
 FROM HumanResources.Employee 
 GROUP BY YEAR(HireDate) 
 ) AS Hires_by_Year 
ON Hires_by_Year.HireYear = Sales_by_Year.SalesYear 
ORDER BY 1 
This second query is less lines but is more redundant than the first query due the repetition of the 
YEAR() function in the GROUP BY clause of each derived table.  We could have completed this query 
with even less lines in a much more cumbersome and less readable manner with only a single derived 
table: 
SELECT YEAR(SOH.OrderDate) AS SalesYear, SUM(TotalDue) AS SalesRevenue 
FROM Sales.SalesOrderHeader SOH 
LEFT OUTER JOIN ( 
 SELECT YEAR(HireDate) AS HiredYear, COUNT(*) AS NewHireCount 
 FROM HumanResources.Employee 

T-SQL Querying Guide  © The Knowlton Group, LLC  85 | P a g e    
 GROUP BY YEAR(HireDate) 
 ) AS EmployeeDetails 
ON EmployeeDetails.HiredYear = YEAR(SOH.OrderDate) 
GROUP BY YEAR(SOH.OrderDate) 
ORDER BY 1 
We are forced to group on functions and use many of them in the SELECT clause and even in the ON line 
of the LEFT OUTER JOIN.  This is by no means wrong.  It simply reduces readability and forces you to 
modify multiple locations in the query to make even a minor change. 
Spend some time going through the practice problems in the following lab section to practice using 
derived tables. 
Lab 16: Using Derived Tables 
1) Will the following query execute properly?  Why or why not? 
SELECT * 
FROM ( 
 SELECT P.BusinessEntityID, P.FirstName, P.LastName, YEAR(E.HireDate) 
 FROM HumanResources.Employee E 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = E.BusinessEntityID 
 ) AS EmployeeDetails 
2) Will the following query execute properly? Why or why not? 
SELECT * 
FROM ( 
 SELECT TOP 100 P.BusinessEntityID, P.FirstName, P.LastName 
 FROM HumanResources.Employee E 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = E.BusinessEntityID 
 ORDER BY 1 
 ) AS EmployeeDetails 
3) Using a derived table so that no functions will appear in the WHERE clause, find all employees 
from the HumanResources.Employee table who were hired in the year 2006 or greater and who 
were born in the year 1968 or less. 
4) Using a derived table, find the total sales revenue generated for the year 2005 and 2006.  The 
WHERE and GROUP BY clause of your SELECT statement should have no functions. 
Using Common Table Expressions 
A common table expression (hereon referred to as a CTE) is another type of table expression in SQL 
Server that has certain advantages over a derived table.  A CTE is directly queried in the FROM clause 
like the derived table; instead it is defined before the SELECT clause.  In the subsection on derived 
tables, we looked at this example query: 

T-SQL Querying Guide  © The Knowlton Group, LLC  86 | P a g e    
SELECT OrderYear, COUNT(*) AS SalesCount 
FROM ( 
 SELECT YEAR(OrderDate) AS OrderYear, SalesOrderID 
 FROM Sales.SalesOrderHeader 
 ) SalesDetails 
WHERE OrderYear = '2006' 
GROUP BY OrderYear 
We can return the same results using a CTE by executing the query: 
WITH SalesDetails  
AS ( 
 SELECT YEAR(OrderDate) AS OrderYear, SalesOrderID 
 FROM Sales.SalesOrderHeader 
) 
SELECT OrderYear, COUNT(*) AS SalesCount 
FROM SalesDetails 
WHERE OrderYear = '2006' 
GROUP BY OrderYear 
Notice that the CTE begins with “WITH” and then the name of the CTE is specified – in this case 
“SalesDetails”.  Then we type “AS” and open a parenthesis.  From there, we type the query that will 
define the CTE and close the opening parenthesis.  Next, we type our query that uses the CTE simply by 
referencing its name.  Unlike the derived table, there is a separation of sorts in that the table expression 
is not directly defined in the eventual SELECT statement.  Let’s look at another example of a query we 
completed using a derived table and then how we could return the same results with a CTE. 
In the last subsection, we created the following query using a derived table: 
SELECT * 
FROM ( 
 SELECT BusinessEntityID, NationalIDNumber, YEAR(BirthDate) AS BirthYear, 
YEAR(HireDate) AS HiredYear 
 FROM HumanResources.Employee 
 ) HR_Emp 
WHERE HiredYear >= 2004 
We could rewrite this query using a CTE with the query: 
WITH HR_Emp  
AS ( 
 SELECT BusinessEntityID, NationalIDNumber, YEAR(BirthDate) AS BirthYear, 
YEAR(HireDate) AS HiredYear 
 FROM HumanResources.Employee 
) 
SELECT * 
FROM HR_Emp 
WHERE HiredYear >= 2004 

T-SQL Querying Guide  © The Knowlton Group, LLC  87 | P a g e    
The query used to define the table expression does not change; the location in which it is defined and 
how it is called are just slightly different between CTEs and derived tables.   
CTEs possess an advantage over derived tables in that they can be used multiple times during a SELECT 
statement.  Say, for example, that we wanted to return three columns: the sales year, the total sales 
that year and the total sales the year before.  To complete this query using derived tables, we would 
have to type the same derived table twice: 
SELECT  
 SalesCurrentYear.SalesYear,  
 SalesCurrentYear.TotalSales AS AnnualSales,  
 SalesPriorYear.TotalSales AS PriorYearSales 
FROM ( 
 SELECT YEAR(OrderDate) AS SalesYear, SUM(TotalDue) AS TotalSales 
 FROM Sales.SalesOrderHeader 
 GROUP BY YEAR(OrderDate) 
 ) AS SalesCurrentYear 
LEFT OUTER JOIN ( 
 SELECT YEAR(OrderDate) AS SalesYear, SUM(TotalDue) AS TotalSales 
 FROM Sales.SalesOrderHeader 
 GROUP BY YEAR(OrderDate) 
 ) AS SalesPriorYear 
ON SalesCurrentYear.SalesYear - 1 = SalesPriorYear.SalesYear 
ORDER BY 1 
However, a CTE can be called multiple times in the same SELECT statement.  The above query could be 
reduced to: 
WITH SalesByYear 
AS ( 
 SELECT YEAR(OrderDate) AS SalesYear, SUM(TotalDue) AS TotalSales 
 FROM Sales.SalesOrderHeader 
 GROUP BY YEAR(OrderDate) 
) 
SELECT  
 CurrentYearSales.SalesYear,  
 CurrentYearSales.TotalSales AS AnnualSales,  
 PriorYearSales.TotalSales AS PriorYearSales 
FROM SalesByYear AS CurrentYearSales 
LEFT OUTER JOIN SalesByYear AS PriorYearSales 
ON CurrentYearSales.SalesYear - 1 = PriorYearSales.SalesYear 
ORDER BY 1 
Notice how the SalesByYear CTE is defined once initially but is used in both the FROM clause and the 
LEFT OUTER JOIN.  The ability to reuse the CTE in the SELECT statement is an important benefit that the 
CTE possesses over the derived table.  For many queries, this can reduce the amount of code required to 
complete a SQL statement. 
Another benefit that CTEs possess over derived tables is that a CTE can be directly nested in other CTEs.  
Let’s take a look at an example of this: 

T-SQL Querying Guide  © The Knowlton Group, LLC  88 | P a g e    
WITH S1 
AS ( 
 SELECT YEAR(OrderDate) AS SalesYear, SalesOrderID, TotalDue 
 FROM Sales.SalesOrderHeader 
),  
S2 AS ( 
 SELECT SalesYear, COUNT(SalesOrderID) AS SalesCount, SUM(TotalDue) AS AnnualSales 
 FROM S1 
 GROUP BY SalesYear 
) 
SELECT SalesYear, SalesCount, AnnualSales 
FROM S2 
WHERE SalesCount > 5000 
ORDER BY 1 
Looking carefully at the CTE “S2”, you will see that the table expression is created based on data from 
the CTE “S1”.  As complexity in your queries expands, the ability to simplify these steps will improve the 
readability of your code and allow you to make code modifications much more easily.   
CTEs and derived tables are often necessary in order to complete some SQL queries.  Becoming familiar 
with them is essential in becoming a proficient intermediate SQL developer.  Complete the following lab 
questions to reinforce your understanding of common table expressions. 
Lab 17: Common Table Expressions 
1) Will the following code execute properly?  Why or why not? 
WITH Sales  
AS ( 
 SELECT YEAR(OrderDate), TotalDue 
 FROM Sales.SalesOrderHeader 
) 
SELECT * 
FROM Sales 
2) Using CTEs, find out, for each year the company hired new employees, how many employees 
the company hired for each year and how many employees they hired the previous year.  Use 
the HumanResources.Employee table for this lab question.  You should have return three 
columns: the hire year, the number of employees hired that year, and the number of employees 
hired the year before. 
3) Using a CTE such that no functions are contained in the GROUP BY clause of the SELECT 
statement, find the number of products that began selling per year (using the 
Production.Product table and the SellStartDate column). 
4) Using a CTE that references the Sales.SalesOrderHeader table, find each month’s total sales for 
the year 2006.  No functions should be used in either the WHERE or the GROUP BY clause of 
your SELECT statement. 

T-SQL Querying Guide  © The Knowlton Group, LLC  89 | P a g e    
Section 13: CASE Statements 
CASE statements are an incredibly helpful tool in completing more advanced SELECT statements.  A 
CASE statement allows you to evaluate multiple conditions and return one possible expression based on 
the criteria you define.  Let’s take a look at an example: 
SELECT ProductID, ListPrice,  
 CASE 
    WHEN ListPrice > 100 THEN 'Expensive Product' 
    ELSE 'Inexpensive Product' 
 END AS ProductPriceClass 
FROM Production.Product 
WHERE ListPrice <> 0 
The previous query returns the ProductID and ListPrice column from the Production.Product table for all 
rows that do not have a ListPrice value equal to zero.  The third column in the result, which we call 
ProductPriceClass, is a CASE statement that, based on certain conditions, outputs a classification of the 
product’s price based on the ListPrice column value.  We are telling SQL that if the ListPrice value of the 
row exceeds 100, then return the string “Expensive Product”.  If that condition is not met, then SQL 
returns the string “Inexpensive Product”.   
CASE statements are useful when trying to classify row values into ranges.  A marketing department 
often stratifies their customer information by age ranges to understand their consumer demographics.  
Let’s apply the same concept to our AdventureWorks data set and group our customers into age ranges 
and output the results: 
WITH CustomerAges 
AS ( 
 SELECT 
    P.FirstName, P.LastName, FLOOR(DATEDIFF(DAY, PD.BirthDate, 
GETDATE())/365.25) AS Age 
 FROM Sales.vPersonDemographics PD 
 INNER JOIN Person.Person P 
 ON PD.BusinessEntityID = P.BusinessEntityID 
) 
SELECT *,  
 CASE 
    WHEN Age IS NULL THEN 'Unknown Age' 
    WHEN Age BETWEEN 0 AND 17 THEN 'Under 18' 
    WHEN Age BETWEEN 18 AND 24 THEN '18 to 24' 
    WHEN Age BETWEEN 25 AND 34 THEN '25 to 34' 
    WHEN Age BETWEEN 35 AND 49 THEN '35 to 49' 
    WHEN Age BETWEEN 50 AND 64 THEN '50 to 64' 
    ELSE 'Over 65' 
 END AS AgeRange 
FROM CustomerAges 
We are first using a CTE to reduce the number of times we have to write our Age function.  The age 
column uses the DATEDIFF() function to determine the number of days between the customer’s birth 
date and the current date.  Dividing the result by 365.25 yields a more accurate numeric value of the 

T-SQL Querying Guide  © The Knowlton Group, LLC  90 | P a g e    
customer’s true age.  We then use the FLOOR() function, a function we haven’t yet seen, which simply 
truncates the decimal and returns only the leading integer.  Then, using our CTE in the SELECT 
statement, we use a multi-condition CASE statement to define the age values that comprise each range.    
Let’s take the previous query a step further and count the number of customers that fit into each age 
range.  We will again utilize CTEs to reduce the redundancy in our code: 
WITH CustomerAges 
AS ( 
 SELECT 
    P.BusinessEntityID, 
    P.FirstName,  
    P.LastName,  
    FLOOR(DATEDIFF(DAY, PD.BirthDate, GETDATE())/365.25) AS Age 
 FROM Sales.vPersonDemographics PD 
 INNER JOIN Person.Person P 
 ON PD.BusinessEntityID = P.BusinessEntityID 
),  
CustomerAgeRanges AS ( 
 SELECT  
    CASE 
   WHEN Age IS NULL THEN 'Unknown Age' 
   WHEN Age BETWEEN 0 AND 17 THEN 'Under 18' 
   WHEN Age BETWEEN 18 AND 24 THEN '18 to 24' 
   WHEN Age BETWEEN 25 AND 34 THEN '25 to 34' 
   WHEN Age BETWEEN 35 AND 49 THEN '35 to 49' 
   WHEN Age BETWEEN 50 AND 64 THEN '50 to 64' 
   ELSE 'Over 65' 
    END AS AgeRange, BusinessEntityID 
 FROM CustomerAges 
) 
SELECT AgeRange, COUNT(*) AS Customer_Count 
FROM CustomerAgeRanges 
GROUP BY AgeRange 
ORDER BY 1 
Now, we have been able to use two CTEs so that a large CASE statement or function is not being placed 
in our GROUP BY clause.  Creating multi-step conditional outputs based on the column values is not an 
uncommon practice in more involved SQL queries.  Generally, queries are developed to return data for 
business users.  Business users often want to see their data presented in ways that does not necessarily 
match up to how the data exists in the database.  Something as simple as the Gender column in the 
Sales.vPersonDemographics view may require output modification for a business user’s satisfaction.  In 
our database, the Gender field is marked by either NULL values or an “M” or “F” value.  Our business 
users may want to see “Male” and “Female” in place of “M” and “F”.  A CASE statement would allow us 
to complete their request without issue: 
SELECT  
 PD.BusinessEntityID,  
 P.FirstName,  
 P.LastName,  
 CASE 

T-SQL Querying Guide  © The Knowlton Group, LLC  91 | P a g e    
    WHEN PD.Gender = 'M' THEN 'Male' 
    WHEN PD.Gender = 'F' THEN 'Female' 
    ELSE 'No Gender Specified' 
 END AS Gender 
FROM Sales.vPersonDemographics PD 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = PD.BusinessEntityID 
In just a few short lines of code, we are able to alter the output of the Gender column to satisfy the 
requirements of the business user who requested the data. 
If you remember back to our section on NULL Handling Functions, we observed the COALESCE() 
function.  We can create a CASE statement that is functionally equivalent to the COALESCE() function: 
SELECT 
 MiddleName,  
 CASE 
    WHEN MiddleName IS NULL THEN '' 
    ELSE MiddleName 
 END AS NULL_Handled_MiddleName 
FROM Person.Person 
You can think of the COALESCE() function as a CASE statement that simply handles when the value of the 
input argument is NULL.  There are many of these similarities throughout SQL Server and trying to 
understand them will greatly improve your understanding of the querying language. 
We can also use CASE statements to cleverly order our results when we wish to use a non-standard 
ordering method.  For example, let’s say we wanted to find each customer’s historical sales with our 
company.  Then, we want to use various ranges to see how many customers fall into each historical 
sales range.  Given the names of our sales ranges, however, ordering in ascending or descending order 
does not properly order the results given the combination of numbers, symbols and text.  The query 
below will accomplish this request: 
WITH CustomerSales 
AS ( 
 SELECT  
    P.BusinessEntityID, 
    SUM(SOH.TotalDue) AS TotalSalesAmount 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.Customer SC 
 ON SC.CustomerID = SOH.CustomerID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SC.PersonID 
 GROUP BY P.BusinessEntityID 
),  
CustomerSalesRange AS ( 
 SELECT  
    P.FirstName + ' ' + P.LastName AS CustomerName,  
    CASE 
   WHEN CS.TotalSalesAmount BETWEEN 0 AND 149.99 THEN 'Under $150' 

T-SQL Querying Guide  © The Knowlton Group, LLC  92 | P a g e    
   WHEN CS.TotalSalesAmount BETWEEN 150 AND 499.99 THEN '$150 - 
$499.99' 
   WHEN CS.TotalSalesAmount BETWEEN 500 AND 999.99 THEN '$500 - 
$999.99' 
   WHEN CS.TotalSalesAmount BETWEEN 1000 AND 4999.99 THEN '$1,000 - 
$4,999.99' 
   WHEN CS.TotalSalesAmount BETWEEN 5000 AND 14999.99 THEN '$5,000 - 
$14,999.99' 
   WHEN CS.TotalSalesAmount BETWEEN 15000 AND 49999.99 THEN '$15,000 - 
$49,999.99' 
   WHEN CS.TotalSalesAmount BETWEEN 50000 AND 149999.99 THEN '$50,000 - 
$149,999.99' 
   ELSE 'Over $150,000' 
    END AS SalesRange 
 FROM CustomerSales CS 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = CS.BusinessEntityID 
) 
SELECT SalesRange, COUNT(*) AS [Customers in Range] 
FROM CustomerSalesRange 
GROUP BY SalesRange 
ORDER BY  
 CASE 
    WHEN SalesRange = 'Under $150' THEN 1 
    WHEN SalesRange = '$150 - $499.99' THEN 2 
    WHEN SalesRange = '$500 - $999.99' THEN 3 
    WHEN SalesRange = '$1,000 - $4,999.99' THEN 4 
    WHEN SalesRange = '$5,000 - $14,999.99' THEN 5 
    WHEN SalesRange = '$15,000 - $49,999.99' THEN 6 
    WHEN SalesRange = '$50,000 - $149,999.99' THEN 7 
    WHEN SalesRange = 'Over $150,000' THEN 8 
 END 
This is a rather lengthy query, but it does not utilize any technique we are not already familiar with.  We 
first define two CTEs so that we do not need to type out the large CASE statement in 
CustomerSalesRange an unnecessary amount of times in our code.  Then we start our SELECT statement 
by grouping our results by the SalesRange column and finding the number of rows that fit into each 
range.  The unique aspect of this query is the CASE statement in the ORDER BY clause.  If you remove the 
ORDER BY clause and execute the query, you will notice the improper ordering that we discussed in the 
previous paragraph.  However, a CASE statement will resolve this ordering issue.  In this CASE statement, 
we are telling the ORDER BY clause to assign the value 1 to the first SalesRange (“Under $150”), 2 for the 
second SalesRange (“$150 - $499.99”), and so on.  The ORDER BY clause then orders not by the 
SalesRange text but by the number we assign to it in the CASE statement.  This forces the ORDER BY 
clause to order in the method we desire – the correct ordering method given the ranges.   
CASE statements can be an invaluable tool in writing successful SQL queries.  The ability to modify 
output column values based on several conditions, defining ranges, and even defining atypical ordering 
are all benefits that CASE statements offer.  Complete the following lab exercises to gain some 
additional practice. 

T-SQL Querying Guide  © The Knowlton Group, LLC  93 | P a g e    
Lab 18: CASE Statements 
1) Return the FirstName and LastName column from Person.Person.  Return a third column that 
outputs “Promotion 1” if the EmailPromotion column value is 0, “Promotion 2” if the 
EmailPromotion value is 1, and “Promotion 3” if the column value is 2. 
2) In a CASE statement you define conditional expressions based on column values.  What happens 
if a value does not meet any of the conditions expressed in the CASE statement? 
3) Using the Person.Person table, complete a CASE statement that returns the string “Long Name” 
if the FirstName column is at least ten characters long.  If the FirstName column is less than ten 
characters, return the string “Short Name”. 
4) Find out how many sales in Sales.SalesOrderHeader fell into the following ranges: “$0 to 
$149.99”, “$150 – 499.99”, “$500 to $4,999.99”, “$5,000 - $24,999.99”, and “Over $25,000”.  
Only analyze sales that had a sales person associated with them. 
5) Using the Production.Product table, if the value in Color column is NULL then return the string 
“No Color”.  If a color exists, then return that color.  Complete this request two ways: once with 
a CASE statement, and the second with a function we have previously discussed in this guide. 

T-SQL Querying Guide  © The Knowlton Group, LLC  94 | P a g e    
Section 14: Ranking Functions 
Ranking functions allow you to rank your results given a certain grouping (referred to as a partition in a 
ranking function) and in a specified order.  There are four ranking functions: ROW_NUMBER(), RANK(), 
DENSE_RANK() and NTILE().  As with each of the prior sections that involve functions, we will provide the 
syntax and description of each along with several supporting examples. 
Function: ROW_NUMBER() 
Syntax: ROW_NUMBER() OVER([PARTITION BY partition_argument] ORDER BY order_by_clause) 
Description: The ROW_NUMBER() ranking function returns the sequential number of each row within 
the context of each partition and ordered based on the order_by_clause. 
Example 1:  For each employee in HumanResources.Employee, return the BusinessEntityID, HireDate 
and then the numerical order in which each employee was hired such that the first employee hired has 
the value 1. 
SELECT  
 BusinessEntityID,  
 HireDate,  
 ROW_NUMBER() OVER(ORDER BY HireDate ASC) AS HireOrder 
FROM HumanResources.Employee 
In this example, we do not need to use the PARTITION BY component of the ROW_NUMBER() function 
since we are looking at the entire data set as one partition.  You will notice that the row with the 
HireOrder value of 1 is the employee that has the oldest HireDate just as we desired. 
Example 2: Find the SalesOrderID, CustomerID, OrderDate and a ranking function that tells us what 
order number each sale is for each customer.  So, for each customer’s oldest order, we wish to see the 
value 1 in this ranking column.  Order the results by the CustomerID and then the ranking function 
column. 
SELECT  
 SalesOrderID,  
 CustomerID,  
 OrderDate,  
 ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS 
CustomerSalesOrder 
FROM Sales.SalesOrderHeader 
ORDER BY 2, 4 
Now you will notice we employ the PARTITION BY clause of the ROW_NUMBER() function to partition 
the set into groups based on the CustomerID.  Given the ORDER BY component within the 
ROW_NUMBER() function, the first order that a customer places is assigned a value of 1, the second a 
value of 2, and so on.   
Example 3: Using the ROW_NUMBER() function, rank each sales person by their total sales in 2006.  The 
sales person with the most sales should be given the value of 1. 

T-SQL Querying Guide  © The Knowlton Group, LLC  95 | P a g e    
WITH Sales_2006 
AS ( 
 SELECT  
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    SOH.TotalDue 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE YEAR(SOH.OrderDate) = 2006 
) 
SELECT  
 SalesPerson,  
 SUM(TotalDue) AS Total_2006_Sales, 
 ROW_NUMBER() OVER(ORDER BY SUM(TotalDue) DESC) AS SalesPersonRank 
FROM Sales_2006 
GROUP BY SalesPerson 
Instead of ordering by a specific column, we had to order by the SUM(TotalDue) given that we were 
grouping our data by the SalesPerson column.  This is completed acceptable for the ROW_NUMBER() 
function.  We could have grouped our data before the ranking function with a query like:  
WITH Sales_2006 
AS ( 
 SELECT  
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    SUM(SOH.TotalDue) AS Total_2006_Sales 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE YEAR(SOH.OrderDate) = 2006 
 GROUP BY P.FirstName + ' ' + P.LastName 
) 
SELECT *, ROW_NUMBER() OVER(ORDER BY Total_2006_Sales DESC) AS SalesPersonRank 
FROM Sales_2006 
It is a matter of personal preference.  The only difference is that the grouping occurs in the CTE in the 
second version of the query as opposed to occurring in the outer SELECT statement like in the first 
query.  Whichever method you choose, the ROW_NUMER() function is still properly ranking based on 
the sales person’s sales in 2006 as the query requires. 
Though it may not initially seem to be important, this query has proven to be tremendously valuable in 
my experience.  In the section on subqueries, we will see some examples where the ROW_NUMBER() 
enables us to retrieve data that might have been incredibly difficult to complete otherwise. 
Function: RANK() 
Syntax: RANK() OVER([PARTITION BY partition_argument] ORDER BY order_by_clause) 

T-SQL Querying Guide  © The Knowlton Group, LLC  96 | P a g e    
Description: Returns the rank of each row such that the rank equals one plus the number of ranks 
preceding it. If you image sports standing with two teams tied for first, the rank for those two teams 
would be the number 1.  However, the third team would not have the rank of 2 but would have the rank 
3.   
Example 1: Find the number of products in each order.  Then find which order quantity is the most 
common and rank them accordingly. 
WITH SalesOrderQuantities 
AS ( 
 SELECT SOH.SalesOrderID, SUM(SOD.OrderQty) AS QtyPerOrder 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesOrderDetail SOD 
 ON SOH.SalesOrderID = SOD.SalesOrderID 
 GROUP BY SOH.SalesOrderID 
),  
SalesOrderQuantitiesGrouped 
AS ( 
 SELECT QtyPerOrder, COUNT(*) AS Orders_with_Quantity 
 FROM SalesOrderQuantities 
 GROUP BY QtyPerOrder 
) 
SELECT  
 QtyPerOrder,  
 Orders_with_Quantity, 
 RANK() OVER(ORDER BY Orders_with_Quantity DESC) AS OrderQuantityRank 
FROM SalesOrderQuantitiesGrouped 
This query first finds out how many products were ordered with each order.  Then it finds out how many 
orders were associated with each order quantity.  Lastly, in the outer SELECT statement, the RANK() 
function supplies us with a rank for each order quantity based on how many orders had the specified 
order quantity. 
Example 2: Open and the SQL file named “NHL_Standings.sql” and execute the code if you have not 
already done so.  The generated table contains the 2014 point totals for each NHL team at the end of 
the regular season.  Using the RANK() function, find each team’s rank. 
SELECT  
 TeamName,  
 RANK() OVER(ORDER BY Points DESC) AS NHL_Rank 
FROM NHL_Standings_2014 
You will notice that both St. Louis and San Jose were tied for fourth place (we are not taking into 
account the tiebreaker rules the NHL has in place for this query).  The team after those two, Pittsburgh, 
is not the fifth ranked team but rather the sixth ranked team.  This is the main subtlety of the RANK() 
function. 
Function: DENSE_RANK() 
Syntax: DENSE_RANK() OVER([PARTITION BY partition_argument] ORDER BY order_by_clause) 

T-SQL Querying Guide  © The Knowlton Group, LLC  97 | P a g e    
Description: Returns the rank of each row in the partition such that the dense rank equals one plus the 
number of distinct ranks before the specified row.  This is slightly different than the RANK() function and 
will become clear in the examples. 
Example 1: Rank the NHL teams based on their point totals using the NHL_Standings_2014 table.  Unlike 
the previous example, use the DENSE_RANK() function instead of the RANK() function. 
SELECT  
 TeamName,  
 DENSE_RANK() OVER(ORDER BY Points DESC) AS NHL_Rank 
FROM NHL_Standings_2014 
Unlike when we used the RANK() function, Pittsburgh – the team following the two teams tied for fourth 
place – now has a dense rank value of 5 instead of 6.  Unlike the rank function which supplies a rank of a 
one plus the number of ranked rows prior to the row in question, the dense rank gives a value of one 
plus the number of distinct ranks preceding the row.  In this case, there were only four distinct dense 
ranks before Pittsburgh, hence they receive a dense rank value of 5. 
Example 2: View both the dense rank and rank of each NHL team side-by-side. 
SELECT  
 TeamName,  
 RANK() OVER(ORDER BY Points DESC) AS NHL_Rank, 
 DENSE_RANK() OVER(ORDER BY Points DESC) AS NHL_Dense_Rank 
FROM NHL_Standings_2014 
You will see that there are a few instances of these “ties”.  Both the DENSE_RANK() and RANK() 
functions handle the subsequent values after the tie differently.  Keep this difference in mind as you 
complete requests that may require a rank similar to those we have seen in the previous three 
examples. 
Function: NTILE() 
Syntax: NTILE() OVER([PARTITION BY partition_argument] ORDER BY order_by_clause) 
Description: Partitions the rows into the number of groups specified in the NTILE() argument based on 
the ordering method specified with the order_by_clause. 
Example 1: Based on each sales person’s sales in 2006, assign them to one of four tiers such that the top 
tier, tier 1, contains only the highest performing sales people. 
WITH SalesPerson_2006 
AS ( 
 SELECT 
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    SUM(SOH.TotalDue) AS TotalSales_2006 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE YEAR(SOH.OrderDate) = 2006 
 GROUP BY P.FirstName + ' ' + P.LastName 

T-SQL Querying Guide  © The Knowlton Group, LLC  98 | P a g e    
) 
SELECT 
 SalesPerson,  
 NTILE(4) OVER(ORDER BY TotalSales_2006 DESC) AS SalesPersonTier 
FROM SalesPerson_2006 
Since we specified the number 4 in the NTILE() clause, the function will split the results into four groups.  
Unfortunately, since the number of rows (14) is not divisible by four, the remainder of the number of 
rows and NTILE() group argument is applied to the first groups.  Since 14 divided by 4 has a remainder of 
two, the first two groups have one more row than groups three and four.   
Example 2: Group our sales force into three groups based on their 2006 sales.  Using a CASE statement 
and the NTILE() ranking function, label the first group as “Top Tier”, second group as “Middle Tier”, and 
third group as “Bottom Tier”. 
WITH SalesPerson_2006 
AS ( 
 SELECT 
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    SUM(SOH.TotalDue) AS TotalSales_2006 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE YEAR(SOH.OrderDate) = 2006 
 GROUP BY P.FirstName + ' ' + P.LastName 
) 
SELECT 
 SalesPerson,  
 CASE 
    WHEN NTILE(3) OVER(ORDER BY TotalSales_2006 DESC) = 1 THEN 'Top Tier' 
    WHEN NTILE(3) OVER(ORDER BY TotalSales_2006 DESC) = 2 THEN 'Middle Tier' 
    WHEN NTILE(3) OVER(ORDER BY TotalSales_2006 DESC) = 3 THEN 'Bottom Tier' 
 END AS SalesPersonTier 
FROM SalesPerson_2006 
In this scenario, we have embedded the NTILE() function directly into our CASE statement.  We could 
have reduced this redundancy by using a second CTE: 
WITH SalesPerson_2006 
AS ( 
 SELECT 
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    SUM(SOH.TotalDue) AS TotalSales_2006 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE YEAR(SOH.OrderDate) = 2006 
 GROUP BY P.FirstName + ' ' + P.LastName 

T-SQL Querying Guide  © The Knowlton Group, LLC  99 | P a g e    
),  
SalesPerson_NTILE AS ( 
 SELECT  
    SalesPerson,  
    NTILE(3) OVER(ORDER BY TotalSales_2006 DESC) AS SalesPersonTierNumber 
 FROM SalesPerson_2006 
) 
SELECT  
 SalesPerson,  
 CASE 
    WHEN SalesPersonTierNumber = 1 THEN 'Top Tier' 
    WHEN SalesPersonTierNumber = 2 THEN 'Middle Tier' 
    WHEN SalesPersonTierNumber = 3 THEN 'Bottom Tier' 
 END AS SalesPersonTier 
FROM SalesPerson_NTILE 
Ranking functions are necessary in many analytic applications.  You may be determining which areas 
have the most traffic, the most visited branches, the most purchased products or any number of other 
analytic questions that leverages ranking functions.  Understanding the subtleties of each and where 
and when to use them is an essential skill for a SQL developer. 
Lab 19: Ranking Functions 
1) Using the ROW_NUMBER() function, rank each customer by the amount they purchased in the 
year 2005. 
2) Using the RANK() function, rank each product by how many times it was ordered in the year 
2006. 
3) Complete the same query as the previous lab question except use the DENSE_RANK() function 
instead of the RANK() function.  Find any differences between the two ranking methods. 
4) Find the total sales for each territory in the year 2006.  Using the NTILE() function, split them 
into two groups. 
5) Expanding on question four, complete a CASE statement that takes the result of the NTILE() 
function and assigns the value “Top Territory Group” when the NTILE() value is 1 and “Bottom 
Territory Group” when the NTILE() value is 2. 

T-SQL Querying Guide  © The Knowlton Group, LLC  100 | P a g e    
Section 15: Set Operations 
Set operations use specific operators to combine multiple queries into a single result set.  This is 
different from a join operation where columns from both tables are presented into a single 
representation.  Set operations simply combine results from separate queries into a single result. 
There are three major set operations: UNION, EXCEPT, and INTERSECT.  However, there are keywords 
that can be used for each set operations to alter the result.   
Set Operation: UNION 
Description: Combines results from multiple queries into one result such that the results contains all 
rows from both queries. 
Venn Diagram: 
As you can see by the Venn diagram, both sets are shaded (equate a set to either a table or query) 
indicating that the results of the union of the sets contains all data from both sets. 
Example 1: Find all states in which our customers reside.  Find all states in which our employees reside.  
Union the two queries such that the states where our employees reside and the states where our 
customers reside both appear in the results. 
SELECT SP.Name AS StateProvinceName 
FROM HumanResources.Employee E 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = E.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
UNION 
SELECT SP.Name AS StateProvinceName 
FROM Sales.Customer SC 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SC.PersonID 
INNER JOIN Person.BusinessEntityAddress BEA 

T-SQL Querying Guide  © The Knowlton Group, LLC  101 | P a g e    
ON BEA.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
The top query identifies all states where our employees reside, while the bottom query identifies all 
states where our customers reside.  The union of these two sets takes the distinct list of states from 
both queries.  By default, the UNION set operation only takes the distinct combination of both sets.  For 
example, Washington appears many times in the top query, but it appears only once in the results. 
The keyword ALL can be placed after the UNION operation and the result will be dramatically different: 
SELECT SP.Name AS StateProvinceName 
FROM HumanResources.Employee E 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = E.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
UNION ALL 
SELECT SP.Name AS StateProvinceName 
FROM Sales.Customer SC 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SC.PersonID 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
Unlike a standard UNION set operation that returns the distinct combined list from both SELECT 
statement, the UNION ALL set operation returns every row regardless of whether or not it is a duplicate 
value. 
Set Operation: INTERSECT 
Description: Returns the values from two queries that are present in both sets. 
Venn Diagram:  

T-SQL Querying Guide  © The Knowlton Group, LLC  102 | P a g e    
The sets A and B represents two tables or queries.  The INTERSECT set operation only returns the 
intersection of those two sets; that is, only values that are present in both sets. 
Example: Find the states in which both customers and employees reside. 
SELECT SP.Name AS StateProvinceName 
FROM HumanResources.Employee E 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = E.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
INTERSECT 
SELECT SP.Name AS StateProvinceName 
FROM Sales.Customer SC 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SC.PersonID 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
The individual queries remain the same as in the previous examples using the UNION and UNION ALL set 
operation.  However, the result set only contains the distinct list of values present in both queries.  
Unlike with the UNION operation, there is no INTERSECT ALL set operation.   
Set Operation: EXCEPT 
Description: Returns all values from set A except for those values that are also present in set B (see 
Venn diagram for visualization). 
Venn Diagram:  

T-SQL Querying Guide  © The Knowlton Group, LLC  103 | P a g e    
You may see the EXCEPT set operation expressed as “A minus B” occasionally. 
Example: Find the states in which our employees reside and no customers reside. 
SELECT SP.Name AS StateProvinceName 
FROM HumanResources.Employee E 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = E.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
EXCEPT 
SELECT SP.Name AS StateProvinceName 
FROM Sales.Customer SC 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SC.PersonID 
INNER JOIN Person.BusinessEntityAddress BEA 
ON BEA.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.[Address] PA 
ON PA.AddressID = BEA.AddressID 
INNER JOIN Person.StateProvince SP 
ON SP.StateProvinceID = PA.StateProvinceID 
The three states returned are the only locations where employees reside and customers do not reside.  
States like Washington, for example, have a presence of both employees and customers.  Therefore, 
based on the EXCEPT operation, Washington would not be included in the result set. 
The way in which this database was designed does not offer many great real-world examples of using 
set operations.  However, the real life labs section will contain many practical scenarios in which a set 
operation might be an effective solution or the only solution. 
Lab 20: Set Operations 
1) Union the following queries together: all products from Production.Product with a ProductID 
value less than 325 and all products from Production.Product with a ProductID greater than or 
equal to 325. 

T-SQL Querying Guide  © The Knowlton Group, LLC  104 | P a g e    
2) Union the following queries together: all products from Production.Product with a ProductID 
value less than 425 and all products from Production.Product with a ProductID greater than or 
equal to 325. 
3) Modify the query from question 2 so that you use a UNION ALL set operation instead of a 
UNION set operation. Are the results different?  Why or why not? 
4) Find the resulting intersection of the following two queries: all products from 
Production.Product with a ProductID value less than 425 and all products from 
Production.Product with a ProductID greater than or equal to 325. 
5) Use the EXCEPT set operation between the following two queries: all products from 
Production.Product with a ProductID value less than 425 and all products from 
Production.Product with a ProductID greater than or equal to 325.  What is the result?  Why? 
6) Use the EXCEPT set operation between the following two queries: all products from 
Production.Product with a ProductID value greater than or equal to 325 and all products from 
Production.Product with a  ProductID value less than 425.  How are the results different from 
the results found in question 5?  Why? 

T-SQL Querying Guide  © The Knowlton Group, LLC  105 | P a g e    
Section 16: Subqueries 
Inline Subqueries 
A subquery is a query that is nested inside of another query.  You can think of a subquery in a similar 
fashion to how you think of nested functions.  The only difference is that instead of nesting functions 
within other functions, you will be nesting queries inside other queries.  Subqueries can be a very useful 
technique to employ when completing SQL statements.  Suppose, for example, that we wanted to find 
the row from Sales.SalesOrderHeader with the largest sale price.  Using our existing techniques, the best 
we could do would be: 
SELECT TOP 1 SalesOrderID, SalesOrderNumber, TotalDue 
FROM Sales.SalesOrderHeader 
ORDER BY TotalDue DESC 
Using a subquery, however, we can complete the request using a scalar inline subquery: 
SELECT TOP 1 SalesOrderID, SalesOrderNumber, TotalDue 
FROM Sales.SalesOrderHeader 
WHERE TotalDue = (SELECT MAX(TotalDue) FROM Sales.SalesOrderHeader) 
This is known as a scalar inline subquery because the subquery returns only a single value result and it 
has no dependency on any of the other clauses in the SELECT statement.  Subqueries that do have 
dependencies to the outer query are called correlated subqueries and will be discussed in the second 
part of this section. 
Breaking down the subquery in the WHERE clause, we first are specifying that we want the TotalDue 
column to be equal to the subquery result.  In the subquery, we are writing a SELECT statement that 
returns the largest TotalDue value.  Take a second and execute the subquery by itself.  You will see that 
only a single value is returned.  When SQL is evaluating this query, it first evaluates the subquery and 
then uses whatever the resulting value is as part of the evaluation for the outer query.  In this instance, 
SQL executes the subquery and returns the value 187487.825.  After the subquery has been evaluated, 
SQL is essentially evaluating the query: 
SELECT TOP 1 SalesOrderID, SalesOrderNumber, TotalDue 
FROM Sales.SalesOrderHeader 
WHERE TotalDue = 187487.825 
The only difference is that we are not hard-coding the largest sale price value into the WHERE clause; 
instead, we are leaving it dynamic to let SQL find the largest value for us.   
Suppose we wanted to find all employees who belonged to the engineering department.  We could 
complete this request using several joins or we could complete the request using a subquery: 
SELECT * 
FROM HumanResources.Employee 

T-SQL Querying Guide  © The Knowlton Group, LLC  106 | P a g e    
WHERE BusinessEntityID IN ( 
 SELECT ED.BusinessEntityID 
 FROM HumanResources.EmployeeDepartmentHistory ED 
 LEFT OUTER JOIN HumanResources.Department D 
 ON D.DepartmentID = ED.DepartmentID 
 LEFT OUTER JOIN HumanResources.Employee E 
 ON E.BusinessEntityID = ED.BusinessEntityID 
 WHERE ED.EndDate IS NULL 
    AND D.Name = 'Engineering' 
 ) 
Our subquery identifies all BusinessEntityIDs that are associated with the engineering department.  
Once the subquery returns the list of BusinessEntityIDs, the outer query that uses those results as the 
values for the WHERE clause with the IN operator.  It is important to note the use of the IN operator 
instead of the equals operator.  Since the result of our subquery contains multiple results (i.e. is 
multivalued and non-scalar), an operator that accepts multiple values is required.  Had we used an equal 
to operator in the previous query, we would have returned an error: 
SELECT * 
FROM HumanResources.Employee 
WHERE BusinessEntityID = ( 
 SELECT ED.BusinessEntityID 
 FROM HumanResources.EmployeeDepartmentHistory ED 
 LEFT OUTER JOIN HumanResources.Department D 
 ON D.DepartmentID = ED.DepartmentID 
 LEFT OUTER JOIN HumanResources.Employee E 
 ON E.BusinessEntityID = ED.BusinessEntityID 
 WHERE ED.EndDate IS NULL 
    AND D.Name = 'Engineering' 
 ) 
Error Message: 
Msg 512, Level 16, State 1, Line 1 
Subquery returned more than 1 value. This is not permitted when the subquery follows 
=, !=, <, <= , >, >= or when the subquery is used as an expression. 
This is a pretty explicit error message in that we know exactly where we went wrong.  SQL sees the 
equals sign and expects a scalar value to compare.  However, when SQL sees a list of BusinessEntityIDs 
in the result of the subquery, an error is generated since the equal to operator only can compare against 
a scalar value.  The error message makes it clear that using the equal to operator (or several others) is 
not compatible with a multi-valued subquery result. 
Suppose we wanted to find the most recently placed orders.  In other words, we wish to find all orders 
that were placed on the most recent day that orders were placed.  To complete this query, we would 
execute: 
SELECT * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate = (SELECT MAX(OrderDate) FROM Sales.SalesOrderHeader) 

T-SQL Querying Guide  © The Knowlton Group, LLC  107 | P a g e    
Unlike the first example we looked at, this query returns multiple rows despite using the equal to 
operator and returning a scalar value in the subquery.  This is completely acceptable.  Since the 
subquery returns a single value, the WHERE clause evaluates properly.  There could be millions of rows 
that contained the most recent order date in the OrderDate column, and that still would not be an issue.  
Just because we return a single value in the subquery does not mean we return a single row in the outer 
query.   
Now suppose we wanted to find how many orders had a sales price greater than the average value of all 
orders that were placed.  A subquery will allow us to complete this request: 
SELECT COUNT(*) 
FROM Sales.SalesOrderHeader 
WHERE TotalDue > ( 
 SELECT AVG(TotalDue) 
 FROM Sales.SalesOrderHeader 
 ) 
In the subquery, we are first identifying what the average sale price is.  Then, the outer query counts all 
rows where the TotalDue column value exceeds the returned value from the subquery.   
There is another subtlety to be aware of when completing inline subqueries.  Take the prior example of 
trying to find all employees associated with the engineering department.  Suppose we modified the 
subquery a bit and execute the code: 
SELECT * 
FROM HumanResources.Employee 
WHERE BusinessEntityID IN ( 
 SELECT ED.BusinessEntityID, D.Name AS DepartmentName 
 FROM HumanResources.EmployeeDepartmentHistory ED 
 LEFT OUTER JOIN HumanResources.Department D 
 ON D.DepartmentID = ED.DepartmentID 
 LEFT OUTER JOIN HumanResources.Employee E 
 ON E.BusinessEntityID = ED.BusinessEntityID 
 WHERE ED.EndDate IS NULL 
    AND D.Name = 'Engineering' 
 ) 
Now, the department name is a column included in the subquery along with the BusinessEntityID value.  
This returns the error message: 
Msg 116, Level 16, State 1, Line 12 
Only one expression can be specified in the select list when the subquery is not introduced 
with EXISTS. 
Since you are using the IN operator in the WHERE clause, SQL is expecting a single column and 
potentially multi-value result.  However, this subquery returns a table valued result since there are 
multiple columns contained in the subquery result set.  This can be a frequent mistake that causes 
errors to appear in your code.  Be cautious not to return multiple columns from your subquery unless 
you absolutely must.  You may have noticed in the error message that the exception to this rule of 
multiple columns in a subquery result set is the EXISTS operator.  We will not discuss the EXISTS 

T-SQL Querying Guide  © The Knowlton Group, LLC  108 | P a g e    
operator in this guide, but there is good reference material at http://msdn.microsoft.com/en-
us/library/ms188336(v=sql.105).aspx. 
If you consider that you are dynamically identifying what values we wish to filter on as opposed to hard 
coding them into your query, then a basic inline subquery is not all that more complex than a standard 
WHERE clause filter.  To reinforce this concept, complete the lab questions below. 
Lab 21: Inline Subquery Practice 
1) Using a subquery, find all rows from Sales.SalesOrderDetail where the ProductID value is in the 
list of all ProductIDs that have the Color “Black” from Production.Product. 
2) Using a subquery, find all employees who have an address based in California.  (Hint: your 
subquery will require multiple joins). 
3) Find all sales for the year 2006 that have a sale price less than the average sale price from our 
entire sales history. 
4) Using a subquery, find all products from Production.Product whose ProductSubcategoryID falls 
in the list of all subcategories whose name ends with “Bikes”. (Hint: you will use the LIKE 
operator in the subquery to handle this request). 
5) Find all sales from Sales.SalesOrderHeader that were ordered on the most recent OrderDate and 
had a TotalDue value less than the average TotalDue for all sales. 
Correlated Subqueries 
Correlated subqueries, unlike inline subqueries, are dependent on the outer query.  Correlated 
subqueries reference columns from the outer query and provide some dynamic options for developing 
queries. 
For our first correlated subquery example, suppose that you wished to find the orders with the most 
recent order date but for each employee.  To complete this request, execute the following query: 
SELECT SalesPersonID, SalesOrderID, OrderDate, TotalDue 
FROM Sales.SalesOrderHeader S1 
WHERE OrderDate = ( 
 SELECT MAX(OrderDate) 
 FROM Sales.SalesOrderHeader S2 
 WHERE S1.SalesPersonID = S2.SalesPersonID 
 ) 
The subquery finds the MAX(OrderDate) but does so for each employee.  Instead of a GROUP BY clause, 
we match the SalesPersonID from the subquery to the SalesPersonID in the outer query (specified by the 
“S1.SalesPersonID = S2.SalesPersonID” in the WHERE clause).  You could have completed the same 
query using a slightly more involved query: 

T-SQL Querying Guide  © The Knowlton Group, LLC  109 | P a g e    
SELECT S1.SalesPersonID, S1.SalesOrderID, S1.OrderDate, S1.TotalDue 
FROM Sales.SalesOrderHeader S1 
INNER JOIN ( 
 SELECT SalesPersonID, MAX(OrderDate) AS OrderDate 
 FROM Sales.SalesOrderHeader 
 GROUP BY SalesPersonID 
 ) S2 
ON S2.SalesPersonID = S1.SalesPersonID AND S2.OrderDate = S1.OrderDate 
WHERE S1.SalesPersonID IS NOT NULL 
In the second method we have to use a self-joining derived table that groups each sales person and 
identifies the max order date.  Then we join to the initial Sales.SalesOrderHeader table matching on 
both the SalesPersonID and the OrderDate columns to return only those orders that had both the 
proper SalesPersonID and where the OrderDate matched to the maximum OrderDate found in the 
derived table.  Additionally, we specify that the SalesPersonID cannot be NULL to ensure that no 
returned rows have a NULL SalesPersonID.  The correlated subquery allows us to complete the query in 
a cleaner fashion than the slightly more convoluted derived table self-joining method. 
Suppose we wanted to use a correlated subquery to find all employees who received a bonus of 
$5,000.00.  To complete this request with a correlated subquery, we could execute: 
SELECT P.FirstName, P.LastName 
FROM HumanResources.Employee E 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = E.BusinessEntityID 
WHERE 5000 = ( 
 SELECT Bonus 
 FROM Sales.SalesPerson SP 
 WHERE SP.BusinessEntityID = E.BusinessEntityID 
 ) 
Here, we start with our list of employees and then, matching on the BusinessEntityID, identify all those 
employees who had a bonus of $5,000.00.  You could have completed the request using: 
SELECT P.FirstName, P.LastName 
FROM HumanResources.Employee E 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = E.BusinessEntityID 
INNER JOIN Sales.SalesPerson SP 
ON SP.BusinessEntityID = E.BusinessEntityID 
WHERE SP.Bonus = 5000 
The previous example of identifying employees who earned a certain bonus is probably best completed 
using joins as the previous query demonstrated.  However, for the purposes of understanding correlated 
subqueries, it is important to understand how these queries can be used.  When completing 
complicated UPDATE or DELETE statements (these will not be discussed in this guide), correlated 
subqueries offer an effective solution to efficiently targeting rows that you wish to update or delete. 
Complete the following few lab questions to gain a little experience in writing correlated subqueries. 

T-SQL Querying Guide  © The Knowlton Group, LLC  110 | P a g e    
Lab 22: Using Correlated Subqueries 
1) Find the orders with the greatest TotalDue for each sales person using a correlated subquery. 
2) Find all sales people who had a bonus of zero dollars using a correlated subquery. 
3) Using a correlated subquery, find all products who had a list price of $539.99. 

T-SQL Querying Guide  © The Knowlton Group, LLC  111 | P a g e    
Section 17: Advanced Aggregations and Pivoting 
Suppose we wanted to find out what are total sales revenue was by month throughout our entire sales 
history.  You might be tempted to write a query like: 
SELECT MONTH(OrderDate) AS OrderMonth, SUM(TotalDue) AS TotalSales 
FROM Sales.SalesOrderHeader 
GROUP BY MONTH(OrderDate) 
ORDER BY 1 
But the previous query will only give you what you historically sold in each month, but not broken out by 
the year in which that month occurred.  You could add another column for the order year and group by 
that column: 
SELECT 
 MONTH(OrderDate) AS OrderMonth,  
 YEAR(OrderDate) AS OrderYear,   
 SUM(TotalDue) AS TotalSales 
FROM Sales.SalesOrderHeader 
GROUP BY MONTH(OrderDate), YEAR(OrderDate) 
ORDER BY 2,1 
But this is not exactly the cleanest result either.  However, by employing some of the Date and Time 
Functions we learned about earlier in this guide, we can cleverly create a nice summary table: 
WITH MonthlyOrderTotals 
AS ( 
 SELECT DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth, SUM(TotalDue) AS MonthlySales 
 FROM Sales.SalesOrderHeader 
 GROUP BY DATEDIFF(MONTH, 0, OrderDate) 
) 
SELECT  
 CAST(DATEADD(MONTH, OrderMonth, 0) AS DATE) AS OrderMonth, 
 MonthlySales 
FROM MonthlyOrderTotals 
ORDER BY 1 
Let’s break this query down step-by-step.  First, we define a CTE before our outer SELECT statement.  In 
this CTE we use the DATEDIFF() function to find the number of months that have passed between the 0 
day (January 1, 1900) and the OrderDate column value.  This gives us an integer value to define each 
month as opposed to having the date stored as a date value.  Since there are many days within each 
month, identify the month using this method accounts for all possible days within the month in an 
integer format.  We then take the SUM(TotalDue) and group the result set by the month integer we 
determined via the DATEDIFF() function.  In the outer SELECT statement, we then convert the integer 
date value back to the date format with the DATEADD() function and order the results by this date 
value.  By using the DATEADD() function with the zero day as the starting date and the OrderMonth 
value (which represents the number of months between the zero day and the OrderDate), we return the 
first of each month throughout our sales history.  Unlike our first two attempts at this request, we have 

T-SQL Querying Guide  © The Knowlton Group, LLC  112 | P a g e    
retrieve each month’s sales information in a clean format.  If you would like, we can employ a CASE 
statement to alter the output format of the date value: 
WITH MonthlyOrderTotals 
AS ( 
 SELECT DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth, SUM(TotalDue) AS MonthlySales 
 FROM Sales.SalesOrderHeader 
 GROUP BY DATEDIFF(MONTH, 0, OrderDate) 
),  
MonthlyOrderTotals2 AS ( 
 SELECT  
    CAST(DATEADD(Month, OrderMonth, 0) AS Date) AS OrderMonth,  
    MONTH(DATEADD(Month, OrderMonth, 0)) AS OrderMonthInteger,  
    CAST(YEAR(DATEADD(Month, OrderMonth, 0)) AS VARCHAR) AS OrderYear, 
    MonthlySales 
 FROM MonthlyOrderTotals 
) 
SELECT 
 CASE 
    WHEN OrderMonthInteger = 1 THEN 'January - ' + OrderYear 
    WHEN OrderMonthInteger = 2 THEN 'February - ' + OrderYear 
    WHEN OrderMonthInteger = 3 THEN 'March - ' + OrderYear 
    WHEN OrderMonthInteger = 4 THEN 'April - ' + OrderYear 
    WHEN OrderMonthInteger = 5 THEN 'May - ' + OrderYear 
    WHEN OrderMonthInteger = 6 THEN 'June - ' + OrderYear 
    WHEN OrderMonthInteger = 7 THEN 'July - ' + OrderYear 
    WHEN OrderMonthInteger = 8 THEN 'August - ' + OrderYear 
    WHEN OrderMonthInteger = 9 THEN 'September - ' + OrderYear 
    WHEN OrderMonthInteger = 10 THEN 'October - ' + OrderYear 
    WHEN OrderMonthInteger = 11 THEN 'November - ' + OrderYear 
    WHEN OrderMonthInteger = 12 THEN 'December - ' + OrderYear 
 END AS MonthValue,  
 MonthlySales 
FROM MonthlyOrderTotals2 
ORDER BY OrderYear, OrderMonthInteger 
Unfortunately, this added a little bit of extra code to our query, but we return a much more business-
user friendly output.  Reporting out of transactional database systems tend to require some of these 
additional modifications to the output.  Data warehouses and data marts, on the other hand, are 
designed with reporting in mind.  Therefore, when you work in a data warehouse environment, there 
should be columns that cleanly store the output format that we worked so hard to create on the fly. 
Running totals are another type of aggregation method that can be requested occasionally.  There is no 
easy function that is built-in to SQL Server to handle such a request.  However, we have the expertise at 
this point to be able to complete such a request.  Suppose we want to get a running tally, by month, of 
the number of orders that were placed and the total order amount.  We could modify some the code in 
our previous example to complete this request: 
WITH MonthlyOrderTotals 
AS ( 
 SELECT  
    DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth,  

T-SQL Querying Guide  © The Knowlton Group, LLC  113 | P a g e    
    SUM(TotalDue) AS MonthlySalesAmount,  
    COUNT(*) AS MonthlySalesCount 
 FROM Sales.SalesOrderHeader 
 GROUP BY DATEDIFF(MONTH, 0, OrderDate) 
) 
SELECT  
 CAST(DATEADD(MONTH, M.OrderMonth, 0) AS DATE) AS OrderMonth,  
 M.MonthlySalesAmount AS CurrentMonthSales, 
 SUM(M1.MonthlySalesAmount) AS RunningTotalSalesAmount, 
 M.MonthlySalesCount AS CurrentMonthSalesCount,  
 SUM(M1.MonthlySalesCount) AS RunningTotalSalesCount 
FROM MonthlyOrderTotals M 
INNER JOIN MonthlyOrderTotals M1 
ON M1.OrderMonth <= M.OrderMonth 
GROUP BY  
 CAST(DATEADD(MONTH, M.OrderMonth, 0) AS DATE),  
 M.MonthlySalesAmount,  
 M.MonthlySalesCount 
ORDER BY 1 
We use our existing CTE (with the addition of the sales count as determined by the COUNT() function), 
we modify our outer query to add a self-join.  This self-join matches each month to itself and each 
month to all months prior to itself.  What this does is create an ungrouped result set where each month 
has as many rows as it has months less than or equal to itself in the CTE.  By then summing the sales 
count and amount values for each month and grouping by the already grouped current month values, 
we create the running total.  If you don’t believe me, use the query earlier in the section to retrieve each 
month’s sales amounts, modify it to include the count, and then paste the results into Excel and create 
your own running calculations.  You will see that the values match up perfectly. 
To understand more about what is going on, execute this query: 
WITH MonthlyOrderTotals 
AS ( 
 SELECT  
    DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth,  
    SUM(TotalDue) AS MonthlySalesAmount,  
    COUNT(*) AS MonthlySalesCount 
 FROM Sales.SalesOrderHeader 
 GROUP BY DATEDIFF(MONTH, 0, OrderDate) 
) 
SELECT  
 CAST(DATEADD(MONTH, M.OrderMonth, 0) AS DATE) AS OrderMonth,  
 M.MonthlySalesAmount AS CurrentMonthSales, 
 M1.MonthlySalesAmount AS M1SalesAmount, 
 M.MonthlySalesCount AS CurrentMonthSalesCount,  
 M1.MonthlySalesCount AS M1SalesCount 
FROM MonthlyOrderTotals M 
INNER JOIN MonthlyOrderTotals M1 
ON M1.OrderMonth <= M.OrderMonth 
ORDER BY 1 

T-SQL Querying Guide  © The Knowlton Group, LLC  114 | P a g e    
This is the ungrouped version of the query that solved our request.  If you notice, each month has as 
many rows as there are months equal to or before it.  So, September 2005 has three rows in this 
ungrouped result set because there are three months less than or equal to it in our MonthlyOrderTotals 
CTE: July 2005, August 2005, and September 2005.  Each row matches up to one of the corresponding 
sales totals from one of the prior months.  By grouping and summing, we simply are adding up each of 
the prior month and the current month’s sales to return the running value.  It is by no means expected 
that you would have solved this on your own initially.  But understanding how SQL Server evaluates 
queries and what tools you have at your disposal allows you to cleverly find solutions to requests that 
you may not have thought were possible. 
Instead of finding the cumulative running total for our sales amount and quantity, let’s make a little bit 
more challenging and find the year-to-date (YTD) totals.  Completing this request requires just a slight 
change to our previous cumulative running total query: 
WITH MonthlyOrderTotals 
AS ( 
 SELECT  
    DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth,  
    SUM(TotalDue) AS MonthlySalesAmount,  
    COUNT(*) AS MonthlySalesCount 
 FROM Sales.SalesOrderHeader 
 GROUP BY DATEDIFF(MONTH, 0, OrderDate) 
) 
SELECT  
 DATEADD(MONTH, M1.OrderMonth, 0) AS OrderMonth_Date,  
 M1.MonthlySalesAmount,  
 SUM(M2.MonthlySalesAmount) AS YTD_Sales_Amount, 
 M1.MonthlySalesCount, 
 SUM(M2.MonthlySalesCount) AS YTD_Sales_Count 
FROM MonthlyOrderTotals M1 
INNER JOIN MonthlyOrderTotals M2 
ON M2.OrderMonth <= M1.OrderMonth 
 AND YEAR(DATEADD(MONTH, M2.OrderMonth, 0)) =  
    YEAR(DATEADD(MONTH, M1.OrderMonth, 0)) 
GROUP BY  
 DATEADD(MONTH, M1.OrderMonth, 0),  
 M1.MonthlySalesAmount,  
 M1.MonthlySalesCount 
ORDER BY 1 
For this request, we must add a second filtering criteria to the WHERE clause.  In the cumulative running 
total example, we needed to add up the values for all months that were either equal to or less than the 
current month.  For the YTD request, we need to do this but only for the months that are equal to or less 
than the current month and are in the same year.  A YTD calculation for Feburary 2006 will certainly not 
include the sales totals for November 2005.  We add an additional filtering criteria that first converts 
each OrderMonth – which if you remember is just an integer value for the start of each month – to a 
date value and then capture the year value from the date.  We do this for each table’s OrderMonth 
value and set the comparison operator to the equals sign.  Now, we are telling SQL to sum up the sales 
values for the current month’s YTD total based on all of the months equal to or less than it and in the 

T-SQL Querying Guide  © The Knowlton Group, LLC  115 | P a g e    
same calendar year.  This slight modification to the query gives us exactly what we are looking for.  Run 
the ungrouped query below to see the result pre-grouping and to test the results in Excel if you would 
like: 
WITH MonthlyOrderTotals 
AS ( 
 SELECT  
    DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth,  
    SUM(TotalDue) AS MonthlySalesAmount,  
    COUNT(*) AS MonthlySalesCount 
 FROM Sales.SalesOrderHeader 
 GROUP BY DATEDIFF(MONTH, 0, OrderDate) 
) 
SELECT  
 DATEADD(MONTH, M1.OrderMonth, 0) AS OrderMonth_Date,  
 M1.MonthlySalesAmount,  
 M2.MonthlySalesAmount AS YTD_Sales_Amount, 
 M1.MonthlySalesCount, 
 M2.MonthlySalesCount AS YTD_Sales_Count 
FROM MonthlyOrderTotals M1 
INNER JOIN MonthlyOrderTotals M2 
ON M2.OrderMonth <= M1.OrderMonth 
 AND YEAR(DATEADD(MONTH, M2.OrderMonth, 0)) =  
    YEAR(DATEADD(MONTH, M1.OrderMonth, 0)) 
ORDER BY 1 
February of 2006 only has two rows in the ungrouped result set – as it should.  The YTD calculation for 
that month should only count its sales values and the sales values for January 2006.  Quickly looking at 
the results, you will see the query is doing exactly that.   
If you remember back to the section on aggregate functions, there was no listed function for finding the 
median of a set of values.  SQL Server does not give provide us with a simple solution, however, we can 
determine the median of a set of values through some code.  Say, for example, that we wanted to find 
the median TotalDue for our entire sales history in Sales.SalesOrderHeader.  To complete this request, 
we would execute the query: 
WITH TopHalf 
AS ( 
 SELECT TOP 50 PERCENT TotalDue 
 FROM Sales.SalesOrderHeader 
 ORDER BY TotalDue 
),  
BottomHalf AS ( 
 SELECT TOP 50 PERCENT TotalDue 
 FROM Sales.SalesOrderHeader 
 ORDER BY TotalDue DESC 
) 
SELECT ((SELECT MAX(TotalDue) FROM TopHalf) +  
 (SELECT MIN(TotalDue) FROM BottomHalf))/2 

T-SQL Querying Guide  © The Knowlton Group, LLC  116 | P a g e    
The first CTE identifies what the top 50 percent of all orders are in ascending order.  The second CTE 
identifies the top 50 percent of all orders in descending order based on the TotalDue value.  Then, the 
outer query utilizes two subqueries to return the largest value from the TopHalf CTE and another to 
return the smallest value from the BottomHalf CTE.  This gives us the smallest value of the largest values 
and the largest values from the smallest values.  This process identifies the two middle values.  Since 
these values could be different, we add the two values together and then divide by two – per proper 
median calculation.  Let’s complete another median request.  Let’s use a similar methodology to identify 
what the median ListPrice is for all products in the Production.Product table that have a ListPrice value 
greater than $0.00. 
WITH TopHalf 
AS ( 
 SELECT TOP 50 PERCENT ListPrice 
 FROM Production.Product 
 WHERE ListPrice <> 0 
 ORDER BY ListPrice 
),  
BottomHalf AS ( 
 SELECT TOP 50 PERCENT ListPrice 
 FROM Production.Product 
 WHERE ListPrice > 0 
 ORDER BY ListPrice DESC 
) 
SELECT ((SELECT MAX(ListPrice) FROM TopHalf) +  
 (SELECT MIN(ListPrice) FROM BottomHalf))/2 
Again, we find the smallest fifty percent and the largest fifty percent of the ListPrice values.  Then by 
taking the largest of the smallest fifty percent and the smallest of the largest fifty percent, we can 
complete a proper median calculation and divide the values by two.  This accurately gives us the median 
of this list which can be verified by ordering and exporting all ListPrice values into Excel and using the 
Excel median function. 
Pivoting data is a common request when working with business users.  Suppose, for example, that we 
wanted to return how many orders were placed by salesperson by month for the year 2007.  So, we 
want to see a row for each sales person and a column for each month of 2007.  Using methods we have 
already learned about, we can complete the request with the following query: 
WITH SalesDetails 
AS ( 
 SELECT  
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    MONTH(OrderDate) AS OrderMonth 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE SalesPersonID IS NOT NULL 
    AND YEAR(OrderDate) = 2007 
 ),  

T-SQL Querying Guide  © The Knowlton Group, LLC  117 | P a g e    
SalesByMonth AS ( 
 SELECT 
    SalesPerson,  
    CASE 
   WHEN OrderMonth = 1 THEN 1 
   ELSE 0 
    END AS JanuarySales,  
    CASE 
   WHEN OrderMonth = 2 THEN 1 
   ELSE 0 
    END AS FebruarySales,  
    CASE 
   WHEN OrderMonth = 3 THEN 1 
   ELSE 0 
    END AS MarchSales,  
    CASE 
   WHEN OrderMonth = 4 THEN 1 
   ELSE 0 
    END AS AprilSales,  
    CASE 
   WHEN OrderMonth = 5 THEN 1 
   ELSE 0 
    END AS MaySales,  
    CASE 
   WHEN OrderMonth = 6 THEN 1 
   ELSE 0 
    END JuneSales,  
    CASE 
   WHEN OrderMonth = 7 THEN 1 
   ELSE 0 
    END JulySales,  
    CASE 
   WHEN OrderMonth = 8 THEN 1 
   ELSE 0 
    END AS AugustSales,  
    CASE 
   WHEN OrderMonth = 9 THEN 1 
   ELSE 0 
    END AS SeptemberSales,  
    CASE 
   WHEN OrderMonth = 10 THEN 1 
   ELSE 0 
    END AS OctoberSales,  
    CASE 
   WHEN OrderMonth = 11 THEN 1 
   ELSE 0 
    END AS NovemberSales,  
    CASE 
   WHEN OrderMonth = 12 THEN 1 
   ELSE 0 
    END AS DecemberSales 
 FROM SalesDetails 
) 
SELECT 
 SalesPerson,  
 SUM(JanuarySales) AS JanuarySales,  
 SUM(FebruarySales) AS FebruarySales,  

T-SQL Querying Guide  © The Knowlton Group, LLC  118 | P a g e    
 SUM(MarchSales) AS MarchSales,  
 SUM(AprilSales) AS AprilSales,  
 SUM(MaySales) AS MaySales,  
 SUM(JuneSales) AS JuneSales,  
 SUM(JulySales) AS JulySales,  
 SUM(AugustSales) AS AugustSales,  
 SUM(SeptemberSales) AS SeptemberSales,  
 SUM(OctoberSales) AS OctoberSales,  
 SUM(NovemberSales) AS NovemberSales,  
 SUM(DecemberSales) AS DecemberSales 
FROM SalesByMonth SM 
GROUP BY SalesPerson 
This looks like a more intimidating query than it truly is.  The first CTE gathers the full name of each sales 
person and then identifies which month of 2007 each order associated with each sales person was 
placed.  The second CTE then applies a CASE statement for each month.  If the OrderMonth integer 
value equals the month we are interested in, then we assign the value 1 to it.  If the OrderMonth does 
not equal the month we are interested in, we set the value to 0.  We do this so that in the outer query, 
we are able to sum up the values for each month and only count those sales where the order was 
actually placed in the month we are interested in.  If you look at just the values in the SalesByMonth CTE 
without completing any groupings, you will see that each row contains the sales person name and then 
a zero in every column except for the column associated with the month that the order was placed.  
Once the grouping takes placed in the outer query, the sum functions ignore the zero values in each 
column and return the total number of orders placed per employee per month. 
We could return the total sales amount by month for each sales person by modifying the query slightly: 
WITH SalesDetails 
AS ( 
 SELECT  
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    MONTH(OrderDate) AS OrderMonth, 
    TotalDue 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE SalesPersonID IS NOT NULL 
    AND YEAR(OrderDate) = 2007 
 ),  
SalesByMonth AS ( 
 SELECT 
    SalesPerson,  
    CASE 
   WHEN OrderMonth = 1 THEN TotalDue 
   ELSE 0 
    END AS JanuarySales,  
    CASE 
   WHEN OrderMonth = 2 THEN TotalDue 
   ELSE 0 
    END AS FebruarySales,  
    CASE 

T-SQL Querying Guide  © The Knowlton Group, LLC  119 | P a g e    
   WHEN OrderMonth = 3 THEN TotalDue 
   ELSE 0 
    END AS MarchSales,  
    CASE 
   WHEN OrderMonth = 4 THEN TotalDue 
   ELSE 0 
    END AS AprilSales,  
    CASE 
   WHEN OrderMonth = 5 THEN TotalDue 
   ELSE 0 
    END AS MaySales,  
    CASE 
   WHEN OrderMonth = 6 THEN TotalDue 
   ELSE 0 
    END JuneSales,  
    CASE 
   WHEN OrderMonth = 7 THEN TotalDue 
   ELSE 0 
    END JulySales,  
    CASE 
   WHEN OrderMonth = 8 THEN TotalDue 
   ELSE 0 
    END AS AugustSales,  
    CASE 
   WHEN OrderMonth = 9 THEN TotalDue 
   ELSE 0 
    END AS SeptemberSales,  
    CASE 
   WHEN OrderMonth = 10 THEN TotalDue 
   ELSE 0 
    END AS OctoberSales,  
    CASE 
   WHEN OrderMonth = 11 THEN TotalDue 
   ELSE 0 
    END AS NovemberSales,  
    CASE 
   WHEN OrderMonth = 12 THEN TotalDue 
   ELSE 0 
    END AS DecemberSales 
 FROM SalesDetails 
) 
SELECT 
 SalesPerson,  
 SUM(JanuarySales) AS JanuarySales,  
 SUM(FebruarySales) AS FebruarySales,  
 SUM(MarchSales) AS MarchSales,  
 SUM(AprilSales) AS AprilSales,  
 SUM(MaySales) AS MaySales,  
 SUM(JuneSales) AS JuneSales,  
 SUM(JulySales) AS JulySales,  
 SUM(AugustSales) AS AugustSales,  
 SUM(SeptemberSales) AS SeptemberSales,  
 SUM(OctoberSales) AS OctoberSales,  
 SUM(NovemberSales) AS NovemberSales,  
 SUM(DecemberSales) AS DecemberSales 
FROM SalesByMonth SM 
GROUP BY SalesPerson 

T-SQL Querying Guide  © The Knowlton Group, LLC  120 | P a g e    
Instead of returning a 1 when the month of the order matches the month column we are interested in, 
we replace the 1 with the sales price.  Then, when the grouping occurs in the outer query, we aren’t 
summing up ones and zeroes, but are summing up the value of each sale.  Once grouped, this returns 
each sales person’s sales totals by month. 
We could, however, save ourselves from typing quite so much and leverage the PIVOT operator.  This 
allows us to return the same results as the previous query but with a bit less code: 
WITH SalesDetails 
AS ( 
 SELECT  
    P.FirstName + ' ' + P.LastName AS SalesPerson,  
    MONTH(OrderDate) AS OrderMonth, 
    TotalDue 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesPerson SP 
 ON SP.BusinessEntityID = SOH.SalesPersonID 
 INNER JOIN Person.Person P 
 ON P.BusinessEntityID = SP.BusinessEntityID 
 WHERE SalesPersonID IS NOT NULL 
    AND YEAR(OrderDate) = 2007 
 ) 
SELECT  
 SalesPerson, 
 [1] AS JanuarySales, 
 [2] AS FebruarySales, 
 [3] AS MarchSales, 
 [4] AS AprilSales, 
 [5] AS MaySales, 
 [6] AS JuneSales, 
 [7] AS JulySales, 
 [8] AS AugustSales, 
 [9] AS SeptemberSales, 
 [10] AS OctoberSales, 
 [11] AS NovemberSales, 
 [12] AS DecemberSales 
FROM SalesDetails 
PIVOT(SUM(TotalDue) FOR OrderMonth 
IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) AS P 
With the PIVOT operator, we define the aggregation method we wish to employ – in this case, the SUM 
function – and then specify on which column we wish to pivot.  Since we wish to pivot the data with 
respect to the OrderMonth, we specify it accordingly.  Then we type IN and define the different 
attributes we want to pivot the data by.  Since we are pivoting by the OrderMonth, we need to specify 
the integer values with which SQL will look to pivot.  In this case, the values one through twelve need to 
be specified.  We need to define an alias for the PIVOT operator just like we do with a derived table.  
Assigning column aliases makes the result set easier to interpret, and, in significantly less lines, we are 
able to produce nearly an identical output.  Instead of seeing zeroes in places where a sales person had 
no sales for a given month, we see NULL values.  Wrapping each column with a COALESCE function could 
handle that scenario if you wished to exactly replicate the output. 

T-SQL Querying Guide  © The Knowlton Group, LLC  121 | P a g e    
There are several other topics relating to advanced aggregations and pivoting.  Grouping sets, for 
example, are another method with which to develop grouped result outputs.  MSDN has several pages 
dedicated with descriptions and examples to discussing grouping sets.  They can be found here: 
http://msdn.microsoft.com/en-us/library/ms178544.aspx. 
Using the lessons supplied in this section, do your best to complete the following lab exercises on 
advanced aggregations and pivoting. 
Lab 23: Advanced Aggregations and Pivoting 
1) Using some of the examples from the first part of this section, create a query that returns the 
first day of the month for the current month.  Can you complete a query that always returns the 
first day of the prior month?  What about the last day of the current month? 
2) Using the Sales.SalesOrderHeader table, which day in our sales history had the most sales 
(based on the sales amount and not count)? 
3) Using the NHL_Standings_2014 table we generated in a previous section (see the SQL file 
contained with the guide to create the table if needed), find the median point total. 
4) Using a cumulative running total, find the total number of products that have been ordered 
based on the OrderQty column in the Sales.SalesOrderDetail table for the end of each month 
since the start of our sales history. (Hint: question 4 and 5 are extensions of the running total 
examples we completed in this section) 
5) For the end of each month, find the YTD total number of products ordered based on the 
Sales.SalesOrderDetail table.  

T-SQL Querying Guide  © The Knowlton Group, LLC  122 | P a g e    
Section 18: SQL Variables 
A variable can be thought of as a placeholder for some value to be determined at another time.  
Variables can hold scalar values and even table values depending on how the variable is defined.  The 
classic programming example is to return the string “Hello World”.  We can complete this task using a 
literal select statement: 
SELECT 'Hello World' 
Or, we can define a variable, set the variable’s value equal to the “Hello World” string, and then return 
the value of the variable using a SELECT statement. 
DECLARE @StringVariable VARCHAR(15) 
SET @StringVariable = 'Hello World' 
SELECT @StringVariable 
There are three parts to this variable driven query.  First, we use the DECLARE statement to define the 
name of the variable and the variable’s data type.  All local variables we will work with must begin with 
the “@” symbol.  Each variable must be assigned a specific data type; this is critical for how SQL Server 
will parse and treat the variable as the SELECT statement is evaluated.  Next, we use a SET statement to 
assign a value to the variable. In this case, we are setting the value of @StringVariable equal to the 
string “Hello World”.  The final step is to reference that variable in the SELECT statement. 
As long as you are using a newer version of SQL Server (SQL 2008 and up), you will be able to complete 
an inline variable definition.  This allows you to remove the SET statement and embed the value of the 
variable directly into the DECLARE statement.  So, we could rewrite the previous query and execute: 
DECLARE @StringVariable VARCHAR(15) = 'Hello World' 
SELECT @StringVariable 
Using variables to execute basic string or integer values is relatively meaningless by itself.  However, 
when embedding the variables in a complete SQL statement, we allow ourselves to create incredibly 
dynamic SQL queries.  For example, suppose we created a query that we ran every month.  This query 
would return all sales details for the previous month.  However, every time we run the query, we have 
to change the hard-coded values of the month.  This is an example of what our code might look like 
(running this code will return no values since the data in the database does not have orders past the 
year 2008): 
SELECT SalesOrderID, SalesOrderNumber, OrderDate, TotalDue 
FROM Sales.SalesOrderHeader 
WHERE OrderDate BETWEEN '1/1/2015' AND '1/31/2015' 

T-SQL Querying Guide  © The Knowlton Group, LLC  123 | P a g e    
This can become annoying; having to modify a simple date just to run a new version of a report is a 
redundant task that should be eliminated.  Using SQL variables will allow us to remove the redundancy 
in modifying the dates each month.  By creating two date variables, @StartDate and @EndDate, and 
assigning them the values equal to the first and last of the prior month, respectively, we will never have 
to modify the SQL code to return the proper data: 
DECLARE @StartDate DATE = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0),  
    @EndDate DATE = DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, 
GETDATE()), 0)) 
SELECT SalesOrderID, SalesOrderNumber, OrderDate, TotalDue 
FROM Sales.SalesOrderHeader 
WHERE OrderDate BETWEEN @StartDate AND @EndDate 
The @StartDate variable is defined as a DATE data type.  It is then set to the first day of the prior month 
based on the current date.  Similarly, the @EndDate is defined as a DATE variable and is set to 
automatically return the last day of the prior month.  Replacing the hard-coded date values in the 
SELECT statement with the @StartDate and @EndDate variables completes the code rewrite.  At no 
point will we ever have to change anything in this code to return last month’s sales data. 
Variables can also be set to the result of a subquery.  Like when we use the equals operator in a WHERE 
clause with a subquery, the subquery must return a scalar value.  We had previously completed an 
example that found which sale had the highest sale price based on the TotalDue column: 
SELECT SalesOrderID, SalesOrderNumber, OrderDate, TotalDue 
FROM Sales.SalesOrderHeader 
WHERE TotalDue = ( 
 SELECT MAX(TotalDue) 
 FROM Sales.SalesOrderHeader 
) 
We can rewrite this query to accept a parameter value for the largest sale price and adjust the WHERE 
clause accordingly: 
DECLARE @LargestSalePrice MONEY = ( 
 SELECT MAX(TotalDue) 
 FROM Sales.SalesOrderHeader 
 ) 
SELECT SalesOrderID, SalesOrderNumber, OrderDate, TotalDue 
FROM Sales.SalesOrderHeader 
WHERE TotalDue = @LargestSalePrice 
Instead of defining the value of the subquery in the WHERE clause, we assign the value of the resulting 
subquery to the @LargestSalePrice variable.  Then, in the WHERE clause, we replace the subquery with 
the variable which already is storing the result of the subquery as defined in the variable declaration. 

T-SQL Querying Guide  © The Knowlton Group, LLC  124 | P a g e    
SQL variables provide a way to parameterize key components of your code without requiring manual 
modification before runtime.  Having dates automatically generated based on the current date 
eliminates the necessity to constantly change code in scheduled processes.  If you become more 
involved with SQL DBA (database administrator) tasks, having variables involved with T-SQL SQL Agent 
job steps can significantly improve the efficiency of automated tasks. In the next few sections we will 
explore more ways that variables are employed in SQL code. 
Lab 24: SQL Variables 
1) Create a variable named “StringExample” that stores the value “This is a sample string”.  
Complete a SELECT statement that returns the value of this variable. 
2) Create two integer variables “IntegerA” and “IntegerB”.  Assign the values 1 and 4 to the 
variables. Write a SELECT statement that returns the sum of these two variables. 
3) Create a date variable called “MaxOrderDate”.  Set the value of “MaxOrderDate” equal to the 
most recent date from the OrderDate column in the Sales.SalesOrderHeader table.  Write a 
SELECT statement that returns all rows from Sales.SalesOrderHeader where the OrderDate 
equals the “MaxOrderDate” variable. 
4) Create a MONEY variable called “SmallestSalePrice”.  Set the “SmallestSalePrice” variable equal 
to the smallest TotalDue value from Sales.SalesOrderHeader.  Write a SELECT statement that 
returns all rows from Sales.SalesOrderHeader where the TotalDue column equals the value of 
“SmallestSalePrice”. 

T-SQL Querying Guide  © The Knowlton Group, LLC  125 | P a g e    
Section 19: WHILE Loops 
WHILE loops allow you to repeatedly execute a statement based on specified conditions.  For example, if 
we wanted to execute a SQL statement that returns the string “Hello World” five times, we could write 
the code: 
DECLARE @Count INT = 1 
WHILE @Count <= 5 
BEGIN 
 SELECT 'Hello World' 
 SET @Count += 1 
END 
The DECLARE statement defines our iterating variable.  We will set this variable to the value of 1 initially.  
The WHILE statement defines the criteria by which the code will execute until the condition is no longer 
met; in this case, the criteria is that the @Count integer must be less than or equal to five in order for 
the SELECT statement to execute.  We then define the code we want to execute within the WHILE 
statement after specifying the operator BEGIN.  After the SELECT statement, we use a SET statement to 
add one to the @Count variable after each iteration.  This allows for the @Count variable to be one 
integer greater each time the SELECT statement is executed.  The WHILE loop is then closed off with the 
END operator.   
The SET statement within the WHILE loop is incredibly important.  Without that statement, our code will 
never cease.  Since the WHILE loop is set to run as long as the @Count variable is less than or equal to 
five, then if we never increase the value of @Count, it will always remain at the value of 1.  If it always 
remains at the value of one, then the condition is always met and the code will run endlessly.   
To see this infinite loop, remove the SET statement from within the WHILE loop: 
DECLARE @Count INT = 1 
WHILE @Count <= 5 
BEGIN 
 SELECT 'Hello World' 
END 
The code will continue to run until an error was reached within the machine or you manually stop the 
code.  Click the red square to the right of the “Execute” button within SQL Server Management Studio to 
stop the code execution. 
Suppose we wanted to create a temporary table that input 50 random integers.  This guide does not 
spend any time discussing how to update, insert, or delete data, so don’t worry about those parts of the 
code. 
CREATE TABLE #TempTable( 
 IntegerValue INT 
) 

T-SQL Querying Guide  © The Knowlton Group, LLC  126 | P a g e    
GO 
DECLARE @Count INT = 1 
WHILE @Count <= 50 
BEGIN 
 INSERT INTO #TempTable(IntegerValue) 
 SELECT RAND()*25+1 
 SET @Count += 1 
END 
SELECT * 
FROM #TempTable 
DROP TABLE #TempTable 
The first portion of the code creates a temporary table.  This is a table that exists virtually (it is not 
stored to disk) and is only available in that query window.  Any other query window will not be able to 
use the temporary table, nor would any other use connected to the instance.  We declare the @Count 
variable and set its value equal to one.  Defining the WHILE loop criteria, we then specify that we will 
insert a value into the #TempTable temporary table in the IntegerValue column.  That value will be a 
random integer between 1 and 25 as specified in the SELECT statement.  After the value has been 
inserted, we increment the value of @Count by one until.  Finally, we return all of the newly inserted 
data from #TempTable.   
We can iterate over multiple variables if we needed to.  Suppose we created a table that stored website 
traffic for fifty users who accessed one of several sites between one and five times.  If we wanted to 
generate sample data to be able to test some queries, we could use variables and WHILE loops to 
generate the sample table: 
CREATE TABLE #WebTraffic( 
 UserID INT, 
 UserAccessCountID INT, 
 WebSiteName VARCHAR(50),   
 AccessDate DATETIME 
) 
GO  
DECLARE @UserID INT = 1,  
    @UserAccessCountID INT = 1,  
    @WebSiteID INT = 1 
WHILE @UserID <= 50 
BEGIN 
 WHILE @UserAccessCountID <= RAND()*5+1 
 BEGIN 
    SET @WebSiteID = RAND()*3+1 
    INSERT INTO #WebTraffic 
    SELECT 
   @UserID,  

T-SQL Querying Guide  © The Knowlton Group, LLC  127 | P a g e    
   @UserAccessCountID,  
   CASE 
        WHEN @WebSiteID = 1 THEN 'www.espn.com' 
        WHEN @WebSiteID = 2 THEN 'www.google.com' 
        ELSE 'www.yahoo.com' 
   END,  
   DATEADD(SECOND, RAND()*86400+1, '1/14/2015') 
    SET @UserAccessCountID += 1 
 END 
 SET @UserAccessCountID = 1 
 SET @UserID += 1 
END 
SELECT * 
FROM #WebTraffic 
DROP TABLE #WebTraffic 
After creating the #WebTraffic temporary table, we declare three variables: @UserID, 
@UserAccessCountID, and @WebSiteID.  The @UserID variable stored the ID value for the one of fifty 
users in the sample data we are generating.  The @UserAccessCountID variable stores the number that 
will iterate the number of sites accessed by each individual user.  The @WebSiteID variable stores a 
random value that is used in conjunction with a CASE statement to determine the site that each user 
visited.  The outer WHILE loop is driven by the @UserID.  The inner WHILE loop is driven by the 
@UserAccessCountID which represents the number of visits each user makes.  After our INSERT 
statement, we increment the @UserAccessCountID variable by one.  The END operator closes out the 
inner WHILE loop properly.  You will then notice the next statement is “SET @UserAccessCountID = 1” 
which seems a little odd compared to what we have done in the past.  However, if you think about what 
is going on which this variable in the inner WHILE loop it might become clear.  We specify that the 
INSERT statement will execute as long as “@UserAccessCountID <= RAND()*5+1”.  Let’s assume, for the 
sake of argument, the RAND()*5+1 function evaluated to five.  Then, as long as @UserAccessCountID is 
less than or equal to five the INSERT statement will execute. After the fifth INSERT statement execution, 
the @UserAccessCountID holds a value of six.  Since six is not less than or equal to five, then the inner 
WHILE loop does not execute.  If we only incremented the @UserID variable by one to create fake data 
for a new user, the @UserAccessCountID variable would still be set to six.  Since we specified that the 
inner WHILE loop will run if the @UserAccessCountID is less than or equal to a random value between 
one and five, then the inner WHILE loop will not run again for any of the remaining users.  To handle 
this, we reset the @UserAccessCountID to one and begin the iteration process again.  After each inner 
WHILE loop is complete, we increment the @UserID variable by one until we have inserted data for all 
fifty users.  The subtlety in resetting the inner iterating variable is important whenever you work with 
nested WHILE loops. 
Those with prior programming background will recognize many of these techniques.  In fact, the 
techniques involved with other programming languages are very similar; it simply is a matter of learning 
the proper syntax involved in writing T-SQL code.  

T-SQL Querying Guide  © The Knowlton Group, LLC  128 | P a g e    
Lab 25: WHILE Loops 
1) Create a query that returns the string “This is a T-SQL guide” five times using a WHILE loop. 
2) Modify the following code to insert 30 rows with a random integer value between one and 
fourty. 
CREATE TABLE #TempTable( 
 IntegerValue INT 
) 
GO 
DECLARE @Count INT = 1 
WHILE @Count <= 50 
BEGIN 
 INSERT INTO #TempTable(IntegerValue) 
 SELECT RAND()*25+1 
 SET @Count += 1 
END 
SELECT * 
FROM #TempTable 
DROP TABLE #TempTable 
3) Modify the previous code to insert ten rows with a random integer between one and ten. 
4) Modify the following query to insert fake web traffic data for thirty users with each user 
accessing a site between one and ten times. 
 CREATE TABLE #WebTraffic( 
 UserID INT, 
 UserAccessCountID INT, 
 WebSiteName VARCHAR(50),   
 AccessDate DATETIME 
) 
GO  
DECLARE @UserID INT = 1,  
  @UserAccessCountID INT = 1,  
  @WebSiteID INT = 1 
WHILE @UserID <= 50 
BEGIN 
 WHILE @UserAccessCountID <= RAND()*5+1 
 BEGIN 
  SET @WebSiteID = RAND()*3+1 
  INSERT INTO #WebTraffic 
  SELECT 
   @UserID,  
   @UserAccessCountID,  
   CASE 
    WHEN @WebSiteID = 1 THEN 'www.espn.com' 

T-SQL Querying Guide  © The Knowlton Group, LLC  129 | P a g e    
    WHEN @WebSiteID = 2 THEN 'www.google.com' 
    ELSE 'www.yahoo.com' 
   END,  
   DATEADD(SECOND, RAND()*86400+1, '1/14/2015') 
  SET @UserAccessCountID += 1 
 END 
 SET @UserAccessCountID = 1 
 SET @UserID += 1 
END 
SELECT * 
FROM #WebTraffic 
DROP TABLE #WebTraffic 

T-SQL Querying Guide  © The Knowlton Group, LLC  130 | P a g e    
Appendix A: Solutions for Lab Questions 
Lab 1: Literal SELECT Statements 
Question 1: 
SELECT 'FirstName LastName' 
Question 2: 
SELECT 7*4 
Question 3: 
SELECT (7-4)*8 
Question 4: 
SELECT 'The Knowlton Group''s SQL Training Class' 
Question 5: 
SELECT 'Day 1 of Training', 5*3 
Lab 2: Basic SELECT Statements 
Question 1: 
SELECT NationalIDNumber 
FROM HumanResources.Employee 
Question 2: 
SELECT NationalIDNumber, JobTitle 
FROM HumanResources.Employee 
Question 3: 
SELECT TOP 20 PERCENT  
 NationalIDNumber, JobTitle, BirthDate 
FROM HumanResources.Employee 
Question 4: 
SELECT TOP 500  
 NationalIDNumber AS SSN,  
 JobTitle AS [Job Title],  
 BirthDate 
FROM HumanResources.Employee 
Question 5: 
SELECT * 
FROM Sales.SalesOrderHeader 
Question 6: 
SELECT TOP 50 PERCENT * 
FROM Sales.Customer 

T-SQL Querying Guide  © The Knowlton Group, LLC  131 | P a g e    
Question 7: 
SELECT Name AS [Product's Name] 
FROM Production.vProductAndDescription 
Question 8: 
SELECT TOP 400 * 
FROM HumanResources.Department 
Question 9: 
SELECT * 
FROM Production.BillOfMaterials 
Question 10: 
SELECT TOP 1500 * 
FROM Sales.vPersonDemographics 
Lab 3: Using the WHERE Clause Part 1 
Question 1: 
SELECT FirstName, LastName 
FROM Person.Person 
WHERE FirstName = 'Mark' 
Question 2: 
SELECT TOP 100 * 
FROM Production.Product 
WHERE ListPrice <> 0.00 
Question 3: 
SELECT * 
FROM HumanResources.vEmployee 
WHERE LastName < 'D' 
Question 4: 
SELECT * 
FROM Person.StateProvince 
WHERE CountryRegionCode = 'CA' 
Question 5: 
SELECT  
 FirstName AS "Customer First Name",  
 LastName AS "Customer Last Name" 
FROM Sales.vIndividualCustomer 
WHERE LastName = 'Smith' 
Lab 4: Symbolic Logic and Truth Tables 
Question 1: 

T-SQL Querying Guide  © The Knowlton Group, LLC  132 | P a g e    
A 
B 
A and B 
T 
T 
T 
T 
F 
F 
F 
T 
F 
F 
F 
F 
Question 2: 
A 
B 
A or B 
T 
T 
T 
T 
F 
T 
F 
T 
T 
F 
F 
F 
Question 3: 
A or B 
C 
(A or B) and C 
T 
T 
T 
T 
F 
F 
F 
T 
F 
F 
F 
F 
Question 4: 
Yes, that is a possibility.  The query is evaluated as: 
SELECT COUNT(*) 
FROM HumanResources.vEmployee 
WHERE FirstName < 'K' OR (PhoneNumberType = 'Cell' AND EmailPromotion = 1) 
If we were to create a truth table for this query it would look like the following: 
A 
B and C 
A or (B and C) 
T 
T 
T 
T 
F 
F 
F 
T 
F 
F 
F 
F 
So, the only time a false row would exist for (B and C) – which represents a scenario like 
PhoneNumberType equals “Work” and EmailPromotion equals “0” – is if A was true – in this case, if the 
FirstName is less than “K”.  If you look at the instances where the EmailPromotion is 0 and the 
PhoneNumberType is “Work”, you will find that each FirstName value begins with a letter less than “K”.  
This is why that possibility exists in the results as the query is written. 
Question 5: 
This would not be possible.  If you look at the truth table in the solution explanation for question 4, the 
only time that a row would exist where the statement (B and C) is false would be if the FirstName 

T-SQL Querying Guide  © The Knowlton Group, LLC  133 | P a g e    
column began with a letter less than “K”.  Since the PhoneNumberType being something other than 
“Cell” would make the conjunction – (B and C) – false, then the FirstName column must begin with a 
letter less than “K” or the row would not be returned in the results. 
Lab 5: Using the WHERE Clause Part 2 
Question 1: 
SELECT * 
FROM Sales.vIndividualCustomer 
WHERE CountryRegionName = 'Australia' OR 
 (PhoneNumberType = 'Cell' AND EmailPromotion = 0) 
Question 2: 
SELECT * 
FROM HumanResources.vEmployeeDepartment 
WHERE Department IN ('Executive', 'Tool Design', 'Engineering') 
SELECT * 
FROM HumanResources.vEmployeeDepartment 
WHERE Department = 'Executive' OR Department = 'Tool Design' 
 OR Department = 'Engineering' 
Question 3: 
SELECT * 
FROM HumanResources.vEmployeeDepartment 
WHERE StartDate BETWEEN '7/1/2000' AND '6/30/2002' 
SELECT * 
FROM HumanResources.vEmployeeDepartment 
WHERE StartDate >= '7/1/2000' AND StartDate <= '6/30/2002' 
Question 4: 
SELECT * 
FROM Sales.vIndividualCustomer 
WHERE LastName LIKE 'R%' 
Question 5: 
SELECT * 
FROM Sales.vIndividualCustomer 
WHERE LastName LIKE '%r' 
Question 6: 
SELECT * 
FROM Sales.vIndividualCustomer 
WHERE LastName IN ('Lopez', 'Martin', 'Wood') AND 
 FirstName LIKE '[C-L]%' 
Question 7: 
SELECT * 
FROM Sales.SalesOrderHeader 
WHERE SalesPersonID IS NOT NULL 

T-SQL Querying Guide  © The Knowlton Group, LLC  134 | P a g e    
Question 8: 
SELECT 
 SalesPersonID,  
 TotalDue 
FROM Sales.SalesOrderHeader 
WHERE SalesPersonID IS NOT NULL 
 AND TotalDue > 70000 
Lab 6: Sorting Using the ORDER BY Clause 
Question 1: 
SELECT  
 FirstName,  
 LastName,  
 JobTitle 
FROM HumanResources.vEmployeeDepartment 
ORDER BY FirstName ASC 
Question 2: 
SELECT  
 FirstName,  
 LastName,  
 JobTitle 
FROM HumanResources.vEmployeeDepartment 
ORDER BY FirstName, LastName DESC 
Question 3: 
SELECT  
 FirstName,  
 LastName,  
 CountryRegionName 
FROM Sales.vIndividualCustomer 
ORDER BY 3 
Question 4: 
SELECT 
 FirstName,  
 LastName,  
 CountryRegionName 
FROM Sales.vIndividualCustomer 
WHERE CountryRegionName IN ('United States', 'France') 
ORDER BY CountryRegionName 
Question 5: 
SELECT  
 Name,  
 AnnualSales,  
 YearOpened,  
 SquareFeet AS [Store Size],  
 NumberEmployees AS [Total Employees] 
FROM Sales.vStoreWithDemographics 

T-SQL Querying Guide  © The Knowlton Group, LLC  135 | P a g e    
WHERE AnnualSales > 1000000 
 AND NumberEmployees >= 45 
ORDER BY [Store Size] DESC, [Total Employees] DESC 
Lab 7: INNER JOIN Practice 
Question 1: 
SELECT  
 P.FirstName,  
 P.LastName,  
 PP.PasswordHash 
FROM Person.Person P 
INNER JOIN Person.[Password] PP 
ON PP.BusinessEntityID = P.BusinessEntityID 
Question 2: 
SELECT  
 E.BusinessEntityID,  
 E.NationalIDNumber,  
 E.JobTitle,  
 EDH.DepartmentID,  
 EDH.StartDate,  
 EDH.EndDate 
FROM HumanResources.Employee E 
INNER JOIN HumanResources.EmployeeDepartmentHistory EDH 
ON E.BusinessEntityID = EDH.BusinessEntityID 
Since the HumanResources.Employee table has 290 rows, you might expect the INNER JOIN of that table 
with HumanResources.EmployeeDepartmentHistory would have no more than 290 rows.  However, 
there are multiple instances of a few BusinessEntityID values in 
HumanResources.EmployeeDepartmentHistory.  Instead of matching a single row from 
HumanResources.Employee to a single row in HumanResources.EmployeeDepartmentHistory, a row in 
HumanResources.Employee is matching to potentially two rows.  This means that multiple rows are 
returned for the duplicate BusinessEntityID values.  The lowest level of detail of uniqueness is often 
referred to as the grain of a table.  Making sure that your joins account for potential grain differences is 
critical to produce proper queries. 
Question 3: 
SELECT  
 P.FirstName,  
 P.LastName,  
 PP.PasswordHash,  
 E.EmailAddress 
FROM Person.Person P 
INNER JOIN Person.[Password] PP 
ON PP.BusinessEntityID = P.BusinessEntityID 
INNER JOIN Person.EmailAddress E 
ON E.BusinessEntityID = P.BusinessEntityID 
Question 4: 
SELECT  

T-SQL Querying Guide  © The Knowlton Group, LLC  136 | P a g e    
 B.Title,  
 B.ISBN,  
 A.AuthorName 
FROM BookAuthor BA 
INNER JOIN Book B 
ON B.BookID = BA.BookID 
INNER JOIN Author A 
ON A.AuthorID = BA.AuthorID 
Question 5: 
SELECT  
 B.Title,  
 B.ISBN,  
 A.AuthorName,  
 P.PublisherName 
FROM BookAuthor BA 
INNER JOIN Book B 
ON B.BookID = BA.BookID 
INNER JOIN Author A 
ON A.AuthorID = BA.AuthorID 
INNER JOIN Publisher P 
ON P.PublisherID = B.PublisherID 
Lab 8: LEFT OUTER JOINs and RIGHT OUTER JOINs 
Question 1: 
SELECT  
 SP.BusinessEntityID,  
 SP.SalesYTD,  
 ST.Name AS [Territory Name] 
FROM Sales.SalesPerson SP 
LEFT OUTER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SP.TerritoryID 
Question 2: 
SELECT  
 P.FirstName,  
 P.LastName, 
 SP.BusinessEntityID,  
 SP.SalesYTD,  
 ST.Name AS [Territory Name] 
FROM Sales.SalesPerson SP 
LEFT OUTER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SP.TerritoryID 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = SP.BusinessEntityID 
WHERE ST.Name IN ('Northeast', 'Central') 
Question 3: 
SELECT  
 P.Name,  
 P.ListPrice,  
 SC.Name AS ProductSubcategoryName,  
 C.Name AS ProductCategoryName 

T-SQL Querying Guide  © The Knowlton Group, LLC  137 | P a g e    
FROM Production.Product P 
LEFT OUTER JOIN Production.ProductSubcategory SC 
ON SC.ProductSubcategoryID = P.ProductSubcategoryID 
LEFT OUTER JOIN Production.ProductCategory C 
ON C.ProductCategoryID = SC.ProductCategoryID 
ORDER BY ProductCategoryName DESC, ProductSubcategoryName ASC 
Lab 9: Aggregate Functions 
Question 1: 
SELECT COUNT(*) 
FROM Person.Person 
Question 2: 
SELECT COUNT(MiddleName) 
FROM Person.Person 
Question 3: 
SELECT AVG(StandardCost) 
FROM Production.Product 
WHERE StandardCost > 0 
Question 4: 
SELECT AVG(Freight) 
FROM Sales.SalesOrderHeader 
WHERE TerritoryID = 4 
Question 5: 
SELECT MAX(ListPrice) 
FROM Production.Product 
Question 6: 
SELECT SUM(P.ListPrice*I.Quantity) 
FROM Production.Product P 
INNER JOIN Production.ProductInventory I 
ON I.ProductID = P.ProductID 
WHERE P.ListPrice > 0 
Lab 10: Grouping with the GROUP BY Clause 
Question 1: 
SELECT  
 PersonType,  
 COUNT(*) AS PersonCount 
FROM Person.Person 
GROUP BY PersonType 
Question 2: 
SELECT  
 Color,  

T-SQL Querying Guide  © The Knowlton Group, LLC  138 | P a g e    
 COUNT(*) AS ProductCount 
FROM Production.Product 
WHERE Color IN ('Red', 'Black') 
GROUP BY Color 
Question 3: 
SELECT  
 TerritoryID,  
 COUNT(*) AS SalesCount 
FROM Sales.SalesOrderHeader  
WHERE OrderDate BETWEEN '7/1/2005' AND '12/31/2006' 
GROUP BY TerritoryID 
Question 4: 
SELECT  
 ST.Name AS TerritoryName,  
 COUNT(*) AS SalesCount 
FROM Sales.SalesOrderHeader SOH 
LEFT OUTER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SOH.TerritoryID 
WHERE OrderDate BETWEEN '7/1/2005' AND '12/31/2006' 
GROUP BY ST.Name 
Question 5: 
SELECT  
 A.AuthorName,  
 COUNT(*) AS BookCount 
FROM BookAuthor BA 
INNER JOIN Author A 
ON A.AuthorID = BA.AuthorID 
GROUP BY A.AuthorName 
Lab 11: Filtering Groups with the GROUP BY Clause 
Question 1: 
SELECT  
 TerritoryID,  
 SUM(TotalDue) AS TotalSales 
FROM Sales.SalesOrderHeader 
GROUP BY TerritoryID 
HAVING SUM(TotalDue) > 10000000 
Question 2: 
SELECT  
 ST.Name AS TerritoryName,  
 SUM(TotalDue) AS TotalSales 
FROM Sales.SalesOrderHeader SOH 
LEFT OUTER JOIN Sales.SalesTerritory ST 
ON ST.TerritoryID = SOH.TerritoryID 
GROUP BY ST.Name 
HAVING SUM(TotalDue) > 10000000 
Question 3: 

T-SQL Querying Guide  © The Knowlton Group, LLC  139 | P a g e    
SELECT  
 Color,  
 COUNT(*) AS ProductCount 
FROM Production.Product 
WHERE Color IS NOT NULL 
GROUP BY Color 
HAVING COUNT(*) >= 20 
Question 4: 
SELECT  
 P.Name AS [Product Name],  
 SUM(SOD.OrderQty) AS ProductOrderCount 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.SalesOrderDetail SOD 
ON SOD.SalesOrderID = SOH.SalesOrderID 
INNER JOIN Production.Product P 
ON P.ProductID = SOD.ProductID 
WHERE SOH.OrderDate BETWEEN '1/1/2006' AND '12/31/2006' 
GROUP BY P.Name 
HAVING SUM(SOD.OrderQty) >= 200 
Question 5: 
SELECT  
 P.FirstName,  
 P.LastName,  
 COUNT(*) AS OrdersPlaced 
FROM Sales.SalesOrderHeader SOH 
INNER JOIN Sales.Customer C 
ON C.CustomerID = SOH.CustomerID 
INNER JOIN Person.Person P 
ON P.BusinessEntityID = C.PersonID 
WHERE OrderDate BETWEEN '7/1/2005' AND '12/31/2006' 
GROUP BY P.FirstName, P.LastName 
HAVING COUNT(*) >= 6 
ORDER BY OrdersPlaced DESC 
Lab 12: String Functions and Nested Functions 
Question 1: 
SELECT LEFT('This is a basic string', 8) 
Question 2: 
SELECT RIGHT('This is another string', 6) 
Question 3: 
SELECT CHARINDEX('e', Name) 
FROM Production.Product 
Question 4: 
SELECT SUBSTRING(Name, 3, 4) 
FROM Sales.SalesTerritory 
Question 5: 

T-SQL Querying Guide  © The Knowlton Group, LLC  140 | P a g e    
SELECT LEFT(RIGHT('This is a slightly longer string', 8), 4) 
Question 6: 
SELECT  
 LEFT(FirstName, CHARINDEX('e', FirstName, 0)) 
FROM Person.Person 
WHERE LEFT(FirstName, CHARINDEX('e', FirstName, 0)) <> '' 
Lab 13: Date and Time Built-In Functions 
Question 1: 
SELECT DATEPART(MONTH, '6/12/2011') 
SELECT MONTH('6/12/2011') 
Question 2: 
SELECT DATEPART(YEAR, '11/20/1992') 
SELECT YEAR('11/20/1992') 
Question 3: 
SELECT DATEADD(DAY, -74, GETDATE()) 
Question 4: 
SELECT GETDATE() 
SELECT GETUTCDATE() 
SELECT SYSDATETIME() 
Question 5: 
SELECT DATEDIFF(DAY, '4/17/1996', '9/4/2001') 
Question 6: 
SELECT DATEDIFF(MONTH, '12/25/1993', DATEADD(DAY, -2719, GETDATE())) 
Lab 14: NULL Handling Functions 
Question 1: 
SELECT  
 COALESCE(Title, 'No Title Listed') AS Title 
FROM Person.Person 
Question 2: 
SELECT  
 COALESCE(MiddleName, 'No Middle Name Listed') AS MiddleName 
FROM Person.Person 

T-SQL Querying Guide  © The Knowlton Group, LLC  141 | P a g e    
Question 3: 
SELECT  
 COALESCE(FirstName + ' ' + MiddleName + ' ' + LastName,  
    FirstName + ' ' + LastName) AS FullName 
FROM Person.Person 
Question 4: 
SELECT NULLIF(MakeFlag, FinishedGoodsFlag) 
FROM Production.Product 
Lab 15: SQL Server Data Types & Type Casting 
Question 1: 
SELECT 
 CAST(FirstName AS VARCHAR) 
FROM Person.Person 
Question 2: 
SELECT  
 11/4,  
 CAST(11 AS FLOAT)/CAST(4 AS FLOAT),  
  11.0/4.0 
Question 3: 
SELECT 
 CAST(FirstName AS VARCHAR(3)) 
FROM Person.Person 
Question 4: 
SELECT  
 CAST(Size AS INT) 
FROM Production.Product 
An error is returned stating “conversion failed when converting the nvarchar value ‘M’ to data type int.”.  
This is because there are some values of the Size column that contain sizes like “S”, “M”, “L”, etc.  These 
values are not numeric and cannot be converted to an integer as there is certainly no integer equivalent 
for the value of the letter “S” or “M”. 
Question 5: 
SELECT  
 CAST(Size AS INT) 
FROM Production.Product 
WHERE ISNUMERIC(Size) = 1 
With the addition of the WHERE clause and the appropriate filtering condition, we do not receive an 
error this time.  The ISNUMERIC() function looks at a value and outputs a 1 if the value is numeric and a 
0 if the value is not numeric.  By specifying that we only want to evaluate numeric values, we bypass the 
error where we cannot convert non-numeric value to an integer.  Since the WHERE clause is evaluated 

T-SQL Querying Guide  © The Knowlton Group, LLC  142 | P a g e    
before the SELECT clause by the database, the CAST() function will not evaluate any values like “S”, “M”, 
or “L” which caused the errors in the previous question. 
Lab 16: Using Derived Tables 
Question 1: 
The query will not execute because a required condition of a table expression is that each column must 
have a name specified.  The “YEAR(HireDate)” column does not have an alias specified, therefore the 
table expression cannot be evaluated and the query will fail. 
Question 2: 
This query will execute without an issue.  ORDER BY clauses are not allowed in table expressions unless 
the TOP operator is used within the expression.  Since the TOP operator is included in this table 
expression the ORDER BY clause is allowed. 
Question 3: 
SELECT * 
FROM ( 
 SELECT *, YEAR(HireDate) AS HireYear, YEAR(BirthDate) AS BirthYear 
 FROM HumanResources.Employee 
 ) AS Emp 
WHERE HireYear >= 2006 AND BirthYear <= 1968 
Question 4: 
SELECT  
 OrderYear,  
 SUM(TotalDue) AS TotalSales 
FROM ( 
 SELECT *, YEAR(OrderDate) AS OrderYear 
 FROM Sales.SalesOrderHeader 
 ) AS SalesOrders 
WHERE OrderYear IN (2005, 2006) 
GROUP BY OrderYear 
Lab 17: Common Table Expressions 
Question 1: 
This query will fail at runtime because the first column of the CTE does not contain a name.  Since every 
column in a table expression must have a name specified, the lack of a column alias will cause the query 
to return an error. 
Question 2: 
WITH Hires 
AS ( 
 SELECT  
    YEAR(HireDate) AS HireYear,  
    BusinessEntityID 

T-SQL Querying Guide  © The Knowlton Group, LLC  143 | P a g e    
 FROM HumanResources.Employee 
), HireByYear AS ( 
 SELECT  
    HireYear,  
    COUNT(*) AS HireCount 
 FROM Hires 
 GROUP BY HireYear 
) 
SELECT  
 H1.HireYear,  
 H1.HireCount AS CurrentYearHireCount,  
 H2.HireCount AS PriorYearHireCount 
FROM HireByYear H1 
LEFT OUTER JOIN HireByYear H2 
ON H1.HireYear = H2.HireYear + 1 
Question 3: 
WITH Products 
AS ( 
 SELECT  
    YEAR(SellStartDate) AS ProductSellStartYear,  
    ProductID 
 FROM Production.Product 
) 
SELECT 
 ProductSellStartYear,  
 COUNT(*) AS ProductCount 
FROM Products 
GROUP BY ProductSellStartYear 
Question 4: 
WITH SalesMonth 
AS ( 
 SELECT 
    MONTH(OrderDate) AS OrderMonth, 
    TotalDue 
 FROM Sales.SalesOrderHeader 
 WHERE YEAR(OrderDate) = 2006 
) 
SELECT  
 OrderMonth,  
 SUM(TotalDue) AS MonthlySales 
FROM SalesMonth 
GROUP BY OrderMonth 
ORDER BY 1 
Lab 18: CASE Statements 

T-SQL Querying Guide  © The Knowlton Group, LLC  144 | P a g e    
Question 1: 
SELECT  
 FirstName,  
 LastName,  
 CASE 
    WHEN EmailPromotion = 0 THEN 'Promotion 1' 
    WHEN EmailPromotion = 1 THEN 'Promotion 2' 
    WHEN EmailPromotion = 2 THEN 'Promotion 3' 
 END AS PromotionName 
FROM Person.Person 
Question 2: 
If a value does not meet any of the conditions in a CASE statement, the output to be returned will be a 
NULL value. 
Question 3: 
SELECT  
 CASE 
    WHEN LEN(FirstName) >= 10 THEN 'Long Name' 
    ELSE 'Short Name' 
 END 
FROM Person.Person 
Question 4: 
WITH SalesRanges 
AS ( 
 SELECT  
    CASE 
   WHEN TotalDue BETWEEN 0 AND 149.99 
        THEN '$0 to $149.99' 
   WHEN TotalDue BETWEEN 150 AND 499.99 
        THEN '$150 to $499.99' 
   WHEN TotalDue BETWEEN 500 AND 4999.99 
        THEN '$500 to $4,999.99' 
   WHEN TotalDue BETWEEN 5000 AND 24999.99 
        THEN '$5,000 to $24,999.99' 
   ELSE 'Over $25,000' 
    END AS SalesPriceRange,  
    SalesOrderID 
 FROM Sales.SalesOrderHeader 
) 
SELECT  
 SalesPriceRange,  
 COUNT(*) 
FROM SalesRanges 
GROUP BY SalesPriceRange 
Question 5: 
SELECT  
 CASE 
    WHEN Color IS NULL THEN 'No Color' 

T-SQL Querying Guide  © The Knowlton Group, LLC  145 | P a g e    
    ELSE Color 
 END AS Color 
FROM Production.Product 
SELECT 
 COALESCE(Color, 'No Color') 
FROM Production.Product 
Lab 19: Ranking Functions 
Question 1: 
WITH SalesDetails 
AS ( 
 SELECT 
    CustomerID,  
    SUM(TotalDue) AS TotalSales 
 FROM Sales.SalesOrderHeader 
 WHERE YEAR(OrderDate) = 2005 
 GROUP BY CustomerID 
) 
SELECT  
 CustomerID,  
 ROW_NUMBER() OVER(ORDER BY TotalSales DESC) AS CustomerSalesRank,  
 TotalSales 
FROM SalesDetails 
Question 2: 
SELECT  
 P.ProductID,  
 OrderQuantity, 
 RANK() OVER(ORDER BY OrderQuantity DESC) 
FROM Production.Product P 
LEFT OUTER JOIN ( 
 SELECT SOD.ProductID, SUM(SOD.OrderQty) AS OrderQuantity 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesOrderDetail SOD 
 ON SOH.SalesOrderID = SOD.SalesOrderID 
 WHERE YEAR(OrderDate) = 2006 
 GROUP BY ProductID 
 ) SP 
ON SP.ProductID = P.ProductID 
Question 3: 
SELECT  
 P.ProductID,  
 OrderQuantity, 
 DENSE_RANK() OVER(ORDER BY OrderQuantity DESC) 
FROM Production.Product P 
LEFT OUTER JOIN ( 
 SELECT SOD.ProductID, SUM(SOD.OrderQty) AS OrderQuantity 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesOrderDetail SOD 
 ON SOH.SalesOrderID = SOD.SalesOrderID 
 WHERE YEAR(OrderDate) = 2006 

T-SQL Querying Guide  © The Knowlton Group, LLC  146 | P a g e    
 GROUP BY ProductID 
 ) SP 
ON SP.ProductID = P.ProductID 
Question 4: 
SELECT  
 TerritoryID,  
 NTILE(2) OVER(ORDER BY SUM(TotalDue) DESC) AS TerritoryTier 
FROM Sales.SalesOrderHeader 
WHERE YEAR(OrderDate) = 2006 
GROUP BY TerritoryID 
Question 5: 
WITH TerritoryRank 
AS ( 
 SELECT  
    TerritoryID,  
    NTILE(2) OVER(ORDER BY SUM(TotalDue) DESC) AS TerritoryTier 
 FROM Sales.SalesOrderHeader 
 WHERE YEAR(OrderDate) = 2006 
 GROUP BY TerritoryID 
 ) 
SELECT  
 TerritoryID, 
 CASE 
    WHEN TerritoryTier = 1 THEN 'Top Territory Group' 
    ELSE 'Bottom Territory Group' 
 END AS TerritoryTier 
FROM TerritoryRank 
Lab 20: Set Operations 
Question 1: 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID < 325 
UNION 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID >= 325 
Question 2: 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID < 425 
UNION 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID >= 325 

T-SQL Querying Guide  © The Knowlton Group, LLC  147 | P a g e    
Question 3: 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID < 425 
UNION ALL 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID >= 325 
The results for question 3 contain more rows than the results in question 2 due to the ALL operator 
added to the UNION set operation.  This is because the UNION ALL operation no longer returns a distinct 
list of values.  Since there is overlap between the two queries, any product with a ProductID value 
between 325 and 424 (inclusive) contains two rows in the results. 
Question 4: 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID < 425 
INTERSECT 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID >= 325 
Question 5: 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID < 425 
EXCEPT 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID >= 325 
Based on the order in which the queries appear, the EXCEPT operator finds all rows with a ProductID 
less than 425 and then returns only those rows that do not appear in the list of all rows with a ProductID 
greater than or equal to 325.  The 13 rows of the result set clearly indicate this; the ProductID’s returned 
are all between 1 and 324. 
Question 6: 
SELECT ProductID, Name 
FROM Production.Product 
WHERE ProductID >= 325 
EXCEPT 
SELECT ProductID, Name 

T-SQL Querying Guide  © The Knowlton Group, LLC  148 | P a g e    
FROM Production.Product 
WHERE ProductID < 425 
With the order of the two queries switched, the result is dramatically different.  The query is now finding 
all products with a ProductID greater than or equal to 325 and then only returning those products where 
the ProductID is NOT less than 425. This yields a list of all products with a ProductID greater than or 
equal to 425.   
Lab 21: Inline Subquery Practice 
Question 1: 
SELECT * 
FROM Sales.SalesOrderDetail 
WHERE ProductID IN ( 
 SELECT ProductID 
 FROM Production.Product 
 WHERE Color = 'Black' 
 ) 
Question 2: 
SELECT * 
FROM HumanResources.Employee 
WHERE BusinessEntityID IN ( 
 SELECT E.BusinessEntityID 
 FROM HumanResources.Employee E 
 INNER JOIN Person.BusinessEntityAddress BEA 
 ON BEA.BusinessEntityID = E.BusinessEntityID 
 INNER JOIN Person.[Address] A 
 ON A.AddressID = BEA.AddressID 
 INNER JOIN Person.StateProvince SP 
 ON SP.StateProvinceID = A.StateProvinceID 
 WHERE SP.Name = 'California' 
 ) 
Question 3: 
SELECT * 
FROM Sales.SalesOrderHeader 
WHERE TotalDue < ( 
 SELECT AVG(TotalDue) 
 FROM Sales.SalesOrderHeader 
 ) 
 AND YEAR(OrderDate) = 2006 
Question 4: 
SELECT * 
FROM Production.Product 
WHERE ProductSubcategoryID IN ( 
 SELECT ProductSubcategoryID 
 FROM Production.ProductSubcategory 
 WHERE Name LIKE '%Bikes' 
 ) 

T-SQL Querying Guide  © The Knowlton Group, LLC  149 | P a g e    
Question 5: 
SELECT * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate = ( 
 SELECT MAX(OrderDate)  
 FROM Sales.SalesOrderHeader 
 ) 
 AND TotalDue < ( 
    SELECT AVG(TotalDue)  
    FROM Sales.SalesOrderHeader 
 ) 
Lab 22: Using Correlated Subqueries 
Question 1: 
SELECT  
 SalesPersonID, SalesOrderNumber, TotalDue 
FROM Sales.SalesOrderHeader S1 
WHERE TotalDue = ( 
 SELECT MAX(TotalDue) 
 FROM Sales.SalesOrderHeader S2 
 WHERE S1.SalesPersonID = S2.SalesPersonID 
 ) 
Question 2: 
SELECT  
 BusinessEntityID,  
 Bonus 
FROM Sales.SalesPerson S1 
WHERE 0 = ( 
 SELECT Bonus 
 FROM Sales.SalesPerson S2 
 WHERE S1.BusinessEntityID = S2.BusinessEntityID 
 ) 
Question 3: 
SELECT * 
FROM Production.Product P1 
WHERE 539.99 = ( 
 SELECT ListPrice 
 FROM Production.Product P2 
 WHERE P1.ProductID = P2.ProductID 
 ) 
Lab 23: Advanced Aggregations and Pivoting 
Question 1: 
SELECT  
 DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) AS FirstOfCurrentMonth,  
 DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0) AS FirstOfPriorMonth,  

T-SQL Querying Guide  © The Knowlton Group, LLC  150 | P a g e    
 DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())+1, 0)) AS 
LastDayOfCurrentMonth 
Question 2: 
WITH SalesDay 
AS ( 
 SELECT  
    SalesOrderID,  
    DATEDIFF(DAY, 0, OrderDate) AS OrderDay,  
    TotalDue 
 FROM Sales.SalesOrderHeader 
),  
SalesTotalByDay AS ( 
 SELECT 
    OrderDay,  
    SUM(TotalDue) AS TotalSales 
 FROM SalesDay 
 GROUP BY OrderDay 
) 
SELECT  
 CAST(DATEADD(DAY, OrderDay, 0) AS DATE) AS OrderDate,  
 TotalSales 
FROM SalesTotalByDay 
WHERE TotalSales = ( 
 SELECT MAX(TotalSales) 
 FROM SalesTotalByDay 
 ) 
Question 3: 
WITH TopHalf 
AS ( 
 SELECT TOP 50 PERCENT Points 
 FROM NHL_Standings_2014 
 ORDER BY Points DESC 
),  
BottomHalf AS ( 
 SELECT TOP 50 PERCENT Points 
 FROM NHL_Standings_2014 
 ORDER BY Points 
) 
SELECT  
 ((SELECT CAST(MAX(Points) AS NUMERIC) FROM BottomHalf) + 
 (SELECT CAST(MIN(Points) AS NUMERIC) FROM TopHalf)) / 2 
Question 4: 
WITH Products 
AS ( 
 SELECT  
    DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth,  
    SOD.OrderQty 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesOrderDetail SOD 
 ON SOD.SalesOrderID = SOH.SalesOrderID 
), ProductsByMonth AS ( 

T-SQL Querying Guide  © The Knowlton Group, LLC  151 | P a g e    
 SELECT 
    OrderMonth, 
    SUM(OrderQty) AS ProductsOrdered 
 FROM Products 
 GROUP BY OrderMonth 
) 
SELECT  
 CAST(DATEADD(MONTH, P1.OrderMonth, 0) AS DATE) AS OrderMonth,  
 P1.ProductsOrdered AS ProductsOrderedCurrentMonth,  
 SUM(P2.ProductsOrdered) AS ProductsOrderedToDate 
FROM ProductsByMonth P1 
LEFT OUTER JOIN ProductsByMonth P2 
ON P2.OrderMonth <= P1.OrderMonth 
GROUP BY CAST(DATEADD(MONTH, P1.OrderMonth, 0) AS DATE), P1.ProductsOrdered 
ORDER BY 1 
Question 5: 
WITH Products 
AS ( 
 SELECT  
    DATEDIFF(MONTH, 0, OrderDate) AS OrderMonth,  
    SOD.OrderQty 
 FROM Sales.SalesOrderHeader SOH 
 INNER JOIN Sales.SalesOrderDetail SOD 
 ON SOD.SalesOrderID = SOH.SalesOrderID 
), ProductsByMonth AS ( 
 SELECT 
    OrderMonth, 
    SUM(OrderQty) AS ProductsOrdered 
 FROM Products 
 GROUP BY OrderMonth 
) 
SELECT  
 CAST(DATEADD(MONTH, P1.OrderMonth, 0) AS DATE) AS OrderMonth,  
 P1.ProductsOrdered AS ProductsOrderedCurrentMonth,  
 SUM(P2.ProductsOrdered) AS ProductsOrderedToDate 
FROM ProductsByMonth P1 
LEFT OUTER JOIN ProductsByMonth P2 
ON P2.OrderMonth <= P1.OrderMonth 
 AND YEAR(DATEADD(MONTH, P1.OrderMonth, 0)) = YEAR(DATEADD(MONTH, P2.OrderMonth, 
0)) 
GROUP BY CAST(DATEADD(MONTH, P1.OrderMonth, 0) AS DATE), P1.ProductsOrdered 
ORDER BY 1 
Lab 24: SQL Variables 
Question 1: 
DECLARE @StringExample VARCHAR(23) = 'This is a sample string' 
SELECT @StringExample 
Question 2: 
DECLARE @IntegerA INT = 1,  

T-SQL Querying Guide  © The Knowlton Group, LLC  152 | P a g e    
    @IntegerB INT = 4 
SELECT @IntegerA + @IntegerB 
Question 3: 
DECLARE @MaxOrderDate DATE = (SELECT MAX(OrderDate) FROM Sales.SalesOrderHeader) 
SELECT * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate = @MaxOrderDate 
Question 4: 
DECLARE @SmallestSalePrice MONEY = (SELECT MIN(TotalDue) FROM Sales.SalesOrderHeader) 
SELECT * 
FROM Sales.SalesOrderHeader 
WHERE TotalDue = @SmallestSalePrice 
Lab 25: WHILE Loops 
Question 1: 
DECLARE @Count INT = 1 
WHILE @Count <= 5 
BEGIN 
 SELECT 'This is a T-SQL Guide' 
 SET @Count += 1 
END 
Question 2: 
CREATE TABLE #TempTable( 
 IntegerValue INT 
) 
GO 
DECLARE @Count INT = 1 
WHILE @Count <= 30 
BEGIN 
 INSERT INTO #TempTable(IntegerValue) 
 SELECT RAND()*40+1 
 SET @Count += 1 
END 
SELECT * 
FROM #TempTable 
DROP TABLE #TempTable 
Question 3: 
CREATE TABLE #TempTable( 
 IntegerValue INT 

T-SQL Querying Guide  © The Knowlton Group, LLC  153 | P a g e    
) 
GO 
DECLARE @Count INT = 1 
WHILE @Count <= 10 
BEGIN 
 INSERT INTO #TempTable(IntegerValue) 
 SELECT RAND()*10+1 
 SET @Count += 1 
END 
SELECT * 
FROM #TempTable 
DROP TABLE #TempTable 
Question 4: 
CREATE TABLE #WebTraffic( 
 UserID INT, 
 UserAccessCountID INT, 
 WebSiteName VARCHAR(50),   
 AccessDate DATETIME 
) 
GO  
DECLARE @UserID INT = 1,  
    @UserAccessCountID INT = 1,  
    @WebSiteID INT = 1 
WHILE @UserID <= 30 
BEGIN 
 WHILE @UserAccessCountID <= RAND()*10+1 
 BEGIN 
    SET @WebSiteID = RAND()*3+1 
    INSERT INTO #WebTraffic 
    SELECT 
   @UserID,  
   @UserAccessCountID,  
   CASE 
        WHEN @WebSiteID = 1 THEN 'www.espn.com' 
        WHEN @WebSiteID = 2 THEN 'www.google.com' 
        ELSE 'www.yahoo.com' 
   END,  
   DATEADD(SECOND, RAND()*86400+1, '1/14/2015') 
    SET @UserAccessCountID += 1 
 END 
 SET @UserAccessCountID = 1 
 SET @UserID += 1 
END 
SELECT * 
FROM #WebTraffic 

T-SQL Querying Guide  © The Knowlton Group, LLC  154 | P a g e    
DROP TABLE #WebTraffic