Multiple Column Dynamic Sorting with T-SQL

I have done a bit of research to find the most efficient way to dynamically sort query results in Microsoft SQL Server 2008. I ran each query test twice, changing the “ORDER BY” clause. I would then check the query plan cache to see if it cached only once. Each time I ran “DBCC FREEPROCCACHE” before the queries to make it easier to weed out. I then ran the following statement to see the query plan caches:

SELECT UseCounts, Cacheobjtype, Objtype, TEXT, query_plan
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
CROSS APPLY sys.dm_exec_query_plan(plan_handle)
WHERE text like '%FROM CategoriesV2%'

The following code caches both queries with separate Adhoc Complied Plans. This is the case because the two queries do not have the exact query string. Note we can sort using a column alias. This would be a case example if you would switch the ORDER BY statement dynamically using variables using an Application Server such as ColdFusion’s #sortByVar#.

Note in all the examples, I start the next statement directly after “GO” without an extra line. This is because the query plan caches the second query with a white space prefix if there where an extra line. Therefore, to eliminate the risk of this being a factor, I do not include an extra line to separate the two queries.

DECLARE @HAS_PRODUCTS BIT = 1

SELECT TOP 2 label AS COL1, categoryID AS COL2
FROM CategoriesV2
WHERE hasProducts = @HAS_PRODUCTS
ORDER BY COL2 DESC, COL1
GO
DECLARE @HAS_PRODUCTS BIT = 1

SELECT TOP 2 label AS COL1, categoryID AS COL2
FROM CategoriesV2
WHERE hasProducts = @HAS_PRODUCTS
ORDER BY NEWID()
GO

The following code also caches both queries with separate Adhoc Complied Plans. This is the case because the “CASE/THEN” statements make a dynamic where clause that is unable to be cached. Note I am sorting by actual column name and not the alias. For some reason, unknown to me, the alias can not be referenced as the sorted column.

DECLARE @SORT VARCHAR(30) = 'SORT1',
		@HAS_PRODUCTS BIT = 1

SELECT TOP 2 label AS COL1, categoryID AS COL2
FROM CategoriesV2
WHERE hasProducts = @HAS_PRODUCTS
ORDER BY
	CASE WHEN @SORT = 'SORT1' THEN label END,
	CASE WHEN @SORT = 'SORT1' THEN categoryID END DESC,
	CASE WHEN @SORT = 'SORT2' THEN categoryID END DESC,
	CASE WHEN @SORT = 'SORT2' THEN label END
GO
DECLARE @SORT VARCHAR(30) = 'SORT2',
		@HAS_PRODUCTS BIT = 1

SELECT TOP 2 label AS COL1, categoryID AS COL2
FROM CategoriesV2
WHERE hasProducts = @HAS_PRODUCTS
ORDER BY
	CASE WHEN @SORT = 'SORT1' THEN label END,
	CASE WHEN @SORT = 'SORT1' THEN categoryID END DESC,
	CASE WHEN @SORT = 'SORT2' THEN categoryID END DESC,
	CASE WHEN @SORT = 'SORT2' THEN label END
GO

In the previous code, I haven’t quite figured out what’s going on using the “CASE/WHEN/THEN” switches. All I know is it works. The closest thing I could think of was it uses a table, which if the “ASC” or “DESC” are appended to each column name, those are the column names and the column names used are populated as the column names in the sort table if it matches the “CASE WHEN” value. If it doesn’t match then the column value is NULL but the column still exists. I came to that perception by replacing “ORDER BY” with “SELECT” and running that clause as a statement by itself. If anyone has better insight on what’s going on here, please let me know.

The following code caches both queries with separate Prepared Complied Plans. This is the case because the two queries, though dynamically created, do not have the exact query string. Note we can sort using a column alias again.

DECLARE @SQL NVARCHAR(MAX),
@SORT VARCHAR(30) = 'COL2 DESC, COL1'

SELECT @SQL = '
SELECT TOP 2 label AS COL1, categoryID AS COL2
FROM CategoriesV2
WHERE hasProducts = @HAS_PRODUCTS
ORDER BY ' + @SORT

EXEC SP_ExecuteSQL @SQL, N'@HAS_PRODUCTS BIT', 1
GO
DECLARE @SQL NVARCHAR(MAX),
@SORT VARCHAR(30) = 'COL1, COL2 DESC'

SELECT @SQL = '
SELECT TOP 2 label AS COL1, categoryID AS COL2
FROM CategoriesV2
WHERE hasProducts = @HAS_PRODUCTS
ORDER BY ' + @SORT

EXEC SP_ExecuteSQL @SQL, N'@HAS_PRODUCTS BIT', 1
GO

In the above statement, if you where to try and inject the @SORT via the SP_ExecuteSQL params, you would receive the error “Msg 1008, Level 16, State 1, Line x: The SELECT item identified by the ORDER BY number 1 contains a variable as part of the expression identifying a column position. Variables are only allowed when ordering by an expression referencing a column name.”

Therefore, from research and testing, I have so far determined that when a dynamic column sort is necessary, a query plan may be cached for each distinct sort value.

If you have found a way to cache a query plan only once, and it is the correct way to accomplish that, please let me know.

Advertisements

#dynamic, #order-by, #sort, #sql, #t-sql

Removing Duplicates in MS SQL 2005+ Table w/o Adding an UID

Let’s say we have a table that links products to a category:

productID AS INT
categoryID AS INT

For some reason we managed to get dulicate productID’s and categoryID’s. I don’t really care to add a unique ID (UID) field to reference against using this script:

ALTER TABLE productCats
ADD RowID INT IDENTITY(1,1)
GO

DELETE productCats
FROM productCats
LEFT OUTER JOIN (
    SELECT MIN(RowID) AS RowID, productID, categoryID
    FROM productCats
    GROUP BY productID, categoryID
) AS KeepRows ON productCats.RowID = KeepRows.RowId
WHERE KeepRows.RowId IS NULL
GO

ALTER TABLE productCats
DROP COLUMN RowID
GO

Here we group together the duplicates into one row and assign it the minimum RowID for that group. If the left join returns a NULL when comparing all the RowID’s to the new set of groups then it will be deleted.

In MS SQL 2005+ I can use a Common Table Expression (CTE) to solve this issue:

WITH CTE
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY productID, categoryID ORDER BY ( SELECT 0) ) RN
         FROM productCats)
DELETE FROM CTE
WHERE RN > 1

In this example, we give a row number (RN) to each group of duplicates and delete that row if greater than 1.

For more information on CTE’s see http://msdn.microsoft.com/en-us/library/ms190766(v=sql.105).aspx

#cte, #duplicates, #sql, #t-sql