Blog Post

Grouping Sets

,

I have given a talk about new data types and T-SQL features to three local groups.  I am presenting it again at the St. Louis Day of .Net on December 13th (http://stlouisdayofdotnet.com/).  Since Itzik Ben-Gan was giving a similar talk at PASS last week, I decided to attend his talk to get some additional insight on the topic.

Itzik’s favorite new T-SQL feature is the grouping sets. Before his talk, I didn’t really see any benefit over using UNION queries (examples below) which will give the identical results.  Itzik said that the new grouping sets performed better than UNION queries. When I had experimented with the new feature, I had found identical performance.  Rather than embarrass myself by questioning the T-SQL guru, I made a note to revisit this later.

Today I decided to play with the new feature to see for myself.  What I found was that if no WHERE clause was used, the performance was the same between the two techniques. The execution plans were almost identical with both queries performing two clustered index scans.  Adding a WHERE clause with a filter on the clustered indexed column changed things considerably, and now the grouping set query performed better. It had one clustered index seek while the UNION query had two.  I found that adding any WHERE clause, even one that resulted in a scan, improved the performance of the grouping set query.  

My examples had only had two grouping levels, and I suspected that adding more grouping levels would show an even more dramatic difference in performance. My hunch was correct. The performance was always better with the grouping set query as long as a WHERE clause was included.  The UNION queries had to access the table once for each level while the grouping set query would access the table just once.

Here are the queries I experimented with using the SQL Server 2008 AdventureWorks sample database.

TEST ONE, identical performance, each has a clustered index scan:

select salesorderid,SUM(unitPrice),ProductID

from Sales.SalesOrderDetail

group by grouping sets(SalesOrderID,ProductID)

go

select null as salesorderid,SUM(unitprice),productid

from Sales.SalesOrderDetail

group by ProductID

union

select salesorderid,SUM(unitprice),null

from Sales.SalesOrderDetail

group by SalesOrderID

-----------------------------------------------------------------------

TEST Two: Add a WHERE clause. Now the grouping set query has one clustered index seek while the UNION query has two:

select salesorderid,SUM(unitPrice),ProductID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by grouping sets(SalesOrderID,ProductID)
go
select null as salesorderid,SUM(unitprice),productid
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by ProductID
union
select salesorderid,SUM(unitprice),null
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by SalesOrderID 
-------------------------------------------------------------------------------

TEST Three: Better performance even when filtering on a non-indexed column. Still one clustered index scan instead of two

select salesorderid,SUM(unitPrice),ProductID
from Sales.SalesOrderDetail
where UnitPrice between 10 and 20
group by grouping sets(SalesOrderID,ProductID)
go
select null as salesorderid,SUM(unitprice),productid
from Sales.SalesOrderDetail
where UnitPrice between 10 and 20
group by ProductID
union
select salesorderid,SUM(unitprice),null
from Sales.SalesOrderDetail
where UnitPrice between 10 and 20
group by SalesOrderID
-----------------------------------------------------------------------------

TEST Four: The performance difference increases with each grouping level added.

select salesorderid,SUM(unitPrice),ProductID,SpecialOfferID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by grouping sets(SalesOrderID,ProductID,SpecialOfferID)
go
select null as salesorderid,SUM(unitprice),productid, null as SpecialOfferID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by ProductID
union
select salesorderid,SUM(unitprice),null,null
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by SalesOrderID
Union
select null,SUM(unitprice),null,SpecialOfferID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by SpecialOfferID
 
 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating