Unfortunately the case in the data had “end dates” for when policies come up for renewal. The desired output was to have policies without valid end dates(NULL) to be last in the output. For this article I’ll use Microsoft’s sample database AdventWorks. The table we’ll use for our report is ProductCostHistory. The EndDate column has NULLs that we want to be listed last in our report.
If we simply query the data with an ORDER-BY we’ll get the following result:SELECT *
FROM [AdventureWorks].[Production].[ProductCostHistory]
ORDER BY [EndDate]
ProductID StartDate EndDate StandardCost ModifiedDate
----------- ----------------------- ----------------------- ------------------ -----------------------707 2003-07-01 00:00:00.000 NULL 13.0863 2003-06-17 00:00:00.000
708 2003-07-01 00:00:00.000 NULL 13.0863 2003-06-17 00:00:00.000
--Data edited out for clarity---
860 2002-07-01 00:00:00.000 2003-06-30 00:00:00.000 9.7136 2003-06-30 00:00:00.000
860 2003-07-01 00:00:00.000 NULL 9.1593 2003-06-17 00:00:00.000
As expected the NULL EndDate values come first in the list.
My report writer buddy came up some pretty clever work around involving temp tables and self-joins. Not very clear and poorly performing to say the least. So I showed him the following as the first replacement solution to placing NULLs at the end.SELECT *
FROM [AdventureWorks].[Production].[ProductCostHistory]ORDER BY isnull([EndDate],'12/31/9999')
ProductID StartDate EndDate StandardCost ModifiedDate
----------- ----------------------- ----------------------- ------------------ -----------------------
707 2001-07-01 00:00:00.000 2002-06-30 00:00:00.000 12.0278 2002-06-30 00:00:00.000
708 2001-07-01 00:00:00.000 2002-06-30 00:00:00.000 12.0278 2002-06-30 00:00:00.000
--Data edited out for clarity---
860 2002-07-01 00:00:00.000 2003-06-30 00:00:00.000 9.7136 2003-06-30 00:00:00.000
860 2003-07-01 00:00:00.000 NULL 9.1593 2003-06-17 00:00:00.000
Exactly what we wanted! BUT, this assumes that the date value ‘12/31/9999’ is beyond any other date in the table. Many times this solution is a safe assumption. This may NOT always be the case. I’ve seen databases where high values are sometimes used to indicate “out of range” or “present but invalid” etc.
The next solution addresses this condition by finding the MAX value in the column then adding an appropriate delta value to place the column after this maximum value. Here is the solution for our product cost history example:DECLARE @maxEndDate DATETIME
SELECT @maxEndDate = MAX(EndDate)+1
FROM [AdventureWorks].[Production].[ProductCostHistory]
SELECT *
FROM [AdventureWorks].[Production].[ProductCostHistory]
ORDER BY isnull([EndDate],@maxEndDate)
ProductID StartDate EndDate StandardCost ModifiedDate
----------- ----------------------- ----------------------- ------------------ -----------------------
707 2001-07-01 00:00:00.000 2002-06-30 00:00:00.000 12.0278 2002-06-30 00:00:00.000
708 2001-07-01 00:00:00.000 2002-06-30 00:00:00.000 12.0278 2002-06-30 00:00:00.000
--Data edited out for clarity---
860 2002-07-01 00:00:00.000 2003-06-30 00:00:00.000 9.7136 2003-06-30 00:00:00.000
860 2003-07-01 00:00:00.000 NULL 9.1593 2003-06-17 00:00:00.000
As expected the data is sorted in the correct order with the NULL values following the valid dates.
Now for what I think is a better solution and does not require hard coding a MAX(or Min) value or scanning through the data finding the last value. Here is my preferred solution to our date problem:SELECT *
FROM [AdventureWorks].[Production].[ProductCostHistory]
ORDER BY (case when EndDate IS NULL then 1 else 0 end),EndDate
ProductID StartDate EndDate StandardCost ModifiedDate
----------- ----------------------- ----------------------- ------------------ -----------------------
707 2001-07-01 00:00:00.000 2002-06-30 00:00:00.000 12.0278 2002-06-30 00:00:00.000
708 2001-07-01 00:00:00.000 2002-06-30 00:00:00.000 12.0278 2002-06-30 00:00:00.000
--Data edited out for clarity---
860 2002-07-01 00:00:00.000 2003-06-30 00:00:00.000 9.7136 2003-06-30 00:00:00.000
860 2003-07-01 00:00:00.000 NULL 9.1593 2003-06-17 00:00:00.000
Look at this for a moment. It is obvious but has a subtle requirement; the CASE statement must precede the column that contains NULL values. The benefit is that this will always work and requires no additional consideration for the range or contents of the column.
Conclusion: While this may be a simple little trick you’d be surprised how many report writers and data consumers struggle with exactly this problem. The CASE solution can also be expanded to provide table sort orders based on column ranges instead of simply NULL versus non-NULL.
-Dave Winters