Blog Post

T-SQL Window Functions – Part 2: Ranking Functions

,

This is a reprint with some revisions of a series I originally published on LessThanDot. You can find the links to the original blogs on my Series page.

TSQL WIndow Functions_thumb[1]This is part 2 in my series on SQL window functions. In this post, we will explore using ranking functions. SQL Server support four different ranking functions which are supported in SQL Server versions 2005 and forward. All of these functions require the use of the OVER clause. The following functions are classified as ranking functions: ROW_NUMBER, RANK, DENSE_RANK, NTILE.

Once again, the following CTE will be used as the query in all examples throughout the post:

with CTEOrders as

(select cast(1 as int) as OrderID, cast(’3/1/2012′ as date) as OrderDate, cast(10.00 as money) as OrderAmt, ‘Joe’ as CustomerName

union select 2, ’3/1/2012′, 11.00, ‘Sam’

union select 3, ’3/2/2012′, 10.00, ‘Beth’

union select 4, ’3/2/2012′, 15.00, ‘Joe’

union select 5, ’3/2/2012′, 17.00, ‘Sam’

union select 6, ’3/3/2012′, 12.00, ‘Joe’

union select 7, ’3/4/2012′, 10.00, ‘Beth’

union select 8, ’3/4/2012′, 18.00, ‘Sam’

union select 9, ’3/4/2012′, 12.00, ‘Joe’

union select 10, ’3/4/2012′, 11.00, ‘Beth’

union select 11, ’3/5/2012′, 14.00, ‘Sam’

union select 12, ’3/6/2012′, 17.00, ‘Beth’

union select 13, ’3/6/2012′, 19.00, ‘Joe’

union select 14, ’3/7/2012′, 13.00, ‘Beth’

union select 15, ’3/7/2012′, 16.00, ‘Sam’

)

select OrderID

,OrderDate

,OrderAmt

,CustomerName

from CTEOrders;

ROW_NUMBER

The ROW_NUMBER function will return a row number for each row within the partition based on the partition and order. This function requires the use of the ORDER BY clause. However, it is often used without the PARTITION BY clause as it will number the entire result set. If PARTITION BY is used, then the row numbering starts over within the partition. The following code shows how both of these work.

select CustomerName

, OrderDate

, OrderAmt

,ROW_NUMBER() OVER(ORDER BY CustomerName) RowNumByCust

,ROW_NUMBER() OVER(PARTITION BY OrderDate ORDER BY CustomerName) RowNumPart

from CTEOrders

order by CustomerName;

RANK and DENSE_RANK

While these are different functions with even different rules, it is easier to understand the difference when put side by side. RANK and DENSE_RANK will order the rows based on the specified partition and apply a rank or number to them. Both RANK and DENSE_RANK will assign the same rank to “ties”. For example if rows 3 and 4 have the same value in the partition, they will have the same rank. The difference is how it handles the next rank number in the series. RANK does a “true” ordering and will apply the ranking based on the number of rows and skip numbers that are ties. So, if you have a tie between the third and fourth row and the first two rows and the final row are unique the ranking is as follows: 1, 2, 3, 3, 5. As you can see, 4 is missing. DENSE_RANK keeps the tie as well, however it does not skip any numbers in the sequence. Here is the same example set based on using DENSE_RANK: 1, 2, 3, 3, 4. As with the ROW_NUMBER function, the ORDER BY is required for these functions.

select CustomerName

, OrderDate

, OrderAmt

,RANK() OVER (ORDER BY CustomerName) RankByCust

,DENSE_RANK() OVER (ORDER BY CustomerName) DenseByCust

,RANK() OVER (PARTITION BY CustomerName ORDER BY OrderDate) RankByCustDt

,DENSE_RANK() OVER (PARTITION BY CustomerName ORDER BY OrderDate) DenseByCustDt
from CTEOrders

order by CustomerName;

NTILE

The last of the ranking functions is NTILE. NTILE groups the data into ordered and ranked groups based on the ORDER BY clause. The number of groups used in the ranking are specified in the function itself. So if you specify four groups to produce a quartile ranking, four ranked values, one through four, will be assigned to each group based on the order. In many cases, the total numbers of rows is not divisible by the number of groups chosen. For instance, if the number of groups is 4 but the total number of rows is 39, then the first three groups will return 10 rows and the final group will only return 9 rows. The function will always frontload the results so the earliest groups will have the “extra” rows. If a PARTITION BY clause is used, the NTILE ranking will applied within each partition. As with the other ranking functions, the ORDER BY clause is required to use this function.

select CustomerName

,OrderDate

,OrderAmt

,NTILE(4) OVER (ORDER BY OrderDate) NtileByDt

,NTILE(2) OVER (PARTITION BY OrderDate ORDER BY  OrderAMt) NtileByDtAmt
from CTEOrders

order by CustomerName;

Up next, using aggregate functions with window functions.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating