Richard's Way: Four Easy Steps To Create Complex SQL That Runs Fast
Recently I was faced with taking over the maintenance of a complex series of functions and stored procedures that produced student transcripts. What I found amazing was that the code had a massive amount of business rules that it needed to implement, yet it still ran incredibly fast. The creator of the code is my good friend Richard Waddell, and he took it for granted that the code would work so well. If I were faced with the same challenges he had, I doubt that I would have been able to succeed as well as he did. However, after studying his style and realizing that he basically follows 4 basic rules, I feel I can create any complex SQL that is easily maintainable and runs fast and efficiently.
Disclaimer, if you don’t agree with this style blame me don’t blame Richard. He is a humble guy who never claimed to be the last word in TSQL, he only showed me what works for him. If you do like this style however, he deserves all the credit.
The Four Rules
Decide how you need the final output to be (you will make a table variable and basically fill it. You do not have to fill it at one time. Perhaps only update certain columns or rows at a time)
Get all possible data up front (this reduces IO and is the biggest cost to speed)
Work from the bottom up (as you discover data you need, see #2)
Create 'while loops' as needed (the purpose of a while loop is to fill or update a table variable)
Let’s imagine that you are faced with the following task:
Allow a user to specify a month and a year, and show the growth of sales daily for all products for an entire month.
To complete this task, we must meet the following challenges:
Query the transaction table and retrieve the sales for each day
Add the total sales from the previous days
For this example we will use the Adventure Works database. You can download it from: http://msftdbprodsamples.codeplex.com/downloads/get/478218
Rule One: Decide How You Want The Final Output To Be
This is rule one, only because getting started can be the hardest thing to do. In our example we want to show a value for each day for each product. We will use a Table Value function with this structure:
CREATE FUNCTION InventoryByDay ( @ReportMonth int, @ReportYear int ) RETURNS @ReportResults TABLE ( ID int Identity(1,1) ,ProductID int NULL ,ProductName nvarchar(50) NOT NULL ,Day1 int NULL ,Day2 int NULL ,Day3 int NULL ,Day4 int NULL ,Day5 int NULL ,Day6 int NULL ,Day7 int NULL ,Day8 int NULL ,Day9 int NULL ,Day10 int NULL ,Day11 int NULL ,Day12 int NULL ,Day13 int NULL ,Day14 int NULL ,Day15 int NULL ,Day16 int NULL ,Day17 int NULL ,Day18 int NULL ,Day19 int NULL ,Day20 int NULL ,Day21 int NULL ,Day22 int NULL ,Day23 int NULL ,Day24 int NULL ,Day25 int NULL ,Day26 int NULL ,Day27 int NULL ,Day28 int NULL ,Day29 int NULL ,Day30 int NULL ,Day31 int NULL ,Day32 int NULL PRIMARY KEY (ID) -- This key helps performance ) AS BEGIN RETURN END
Rule Two: Get All Possible Data Up Front
In this example, we need to read through the Transaction table. We want to get all the sales Transactions up front and place into a table variable.
SQL Server documentation does not recommend creating table variables larger than 100 records, however in my practice it works fine as long as you always create a primary key or unique constraint.
However, we want to minimize the amount of information when possible. In this case we only want transactions for the month we are reporting on.
-- ============================================= -- Declare Table -- ============================================= DECLARE @TransactionHistory table ( TransactionID int NOT NULL, ProductID int NOT NULL, Quantity int NOT NULL, TransactionDate datetime NOT NULL --PRIMARY KEY (TransactionID) -- This key helps performance or UNIQUE CLUSTERED as used below UNIQUE CLUSTERED (TransactionID, ProductID, TransactionDate) ) -- ============================================= -- Get Raw Data -- ============================================= -- Get All Sales (S) Transactions Insert Into @TransactionHistory ( TransactionID, ProductID, Quantity, TransactionDate ) SELECT TransactionID, ProductID, Quantity, TransactionDate FROM Production.TransactionHistory WHERE TransactionDate >= @MonthStart AND TransactionType = 'S' AND TransactionDate < @FirstDayInNextMonth;
Rule Three: Work From The Bottom Up
At this point we want to concentrate on the final output and add things as needed. We know that the first thing we need is to output a row for each product. We add the following code:
-- Insert All Products into @ReportResults table Insert Into @ReportResults ( ProductID, ProductName ) SELECT ProductID, Name FROM Production.Product;
This produces the following output:
Rule Four: Create While Loops As Needed
This is the heart of the technique. Let us start with a simple loop:
It produces the following output:
In our example we add a inner loop that will get the inventory for the product row that we are currently on in the outer loop.
Only Day1 and Day2 is shown for sake of clarity, but in the final version (see download below), there is an “If Then” statement for each day:
-- ============================================= -- Loop through each day for each Product -- ============================================= WHILE @CurrentDay <= @DaysInMonth BEGIN -- Get the inventory for the day SET @LastInventoryAmount = @LastInventoryAmount + (SELECT ISNULL(SUM(Quantity),0) as TotalQuantity FROM @TransactionHistory where ProductID = @ProductID AND TransactionDate = @CurrentDateForInventory) -- Perform an update for the current day if @CurrentDay = 1 BEGIN UPDATE @ReportResults SET Day1 = @LastInventoryAmount WHERE ID = @RowIndex END Else if @CurrentDay = 2 BEGIN UPDATE @ReportResults SET Day2 = @LastInventoryAmount WHERE ID = @RowIndex END SET @CurrentDateForInventory = dateadd(d,(@CurrentDay),@MonthStart); -- Increase @CurrentDay to move to the next day Set @CurrentDay = @CurrentDay + 1 END -- Reset the values SET @CurrentDay = 1; SET @LastInventoryAmount = 0; SET @CurrentDateForInventory = @MonthStart;
When we run the function:
SELECT * FROM [dbo].[InventoryByDay] (5,2004)
Freedom From Fear Of T-SQL
The looping logic may seem simplistic and unsophisticated, but Richard has shown me that there are certain requirements that can only be addressed by using this method. If we had to implement business rules that adjusted the calculated values based on, for example, the day of the week, you would need to step through each record as this example does.
Maintainability is very important. Having a huge complex stored procedure that you can simply read from top to bottom allows you to easily make adjustments.
Most importantly, this method will handle any requirement, no matter how complex.
You can download the complete code on the Downloads page of this site.