A Division of Technology Associates International Corporation
Maximo Blog

Grouping Sets in SQL server 2008

September 23, 2008 in SQL Server by Michael Chrisman 1 Comment

SQL Server 2008 offers an interesting new functionality for Group By queries. It is called a Grouping Set. A Grouping Set allows you to merge multiple Group By queries into one compact query. Let's do an example.

Let's say you have a employee billing database that contains the billing history for your staff. You have a request to generate a report that will display the following:

Employee Yearly hours billed totals
Client Yearly hours billed totals
Customer-Employee Yearly hours billed totals
Total yearly hours billed

However, your boss want this all in one report. To do this in one result set you would have to use a UNION ALL query like this:

   1:  SELECT null as clientID, employeeID, YEAR(billDate) as billingYear, SUM(hours) as totalHours
   2:  FROM dbo.billing
   3:  GROUP BY employeeID, YEAR(billDate)
   4:   
   5:  UNION ALL
   6:   
   7:  SELECT clientID, null, YEAR(billDate) as billingYear, SUM(hours) as totalHours
   8:  FROM dbo.billing
   9:  GROUP BY clientID, YEAR(billDate)
  10:   
  11:  UNION ALL
  12:   
  13:  SELECT clientID, employeeID, YEAR(billDate) as billingYear, SUM(hours) as totalHours
  14:  FROM dbo.billing
  15:  GROUP BY clientID, employeeID, YEAR(billDate)
  16:   
  17:  UNION ALL
  18:   
  19:  SELECT null, null, YEAR(billDate) as billingYear, SUM(hours) as totalHours
  20:  FROM dbo.billing
  21:  GROUP BY YEAR(billDate);

Not only is this a long query, but we had to use null values to fill in columns at are not used in each query. Using Grouping Sets in SQL Server 2008, we could write the query list this:

   1:  SELECT clientID, employeeID, YEAR(billDate) as billingYear, SUM(hours) as totalHours
   2:  FROM dbo.billing
   3:  GROUP BY GROUPING SETS
   4:  (    ( employeeID, YEAR(billDate) ),
   5:       ( clientID, YEAR(billDate) ),
   6:       ( clientID, employeeID, YEAR(billDate) ),
   7:       ( YEAR(billDate) )
   8:  );

This new, shorter query would produce the same results. As it turns out, the Grouping Sets fun doesn't stop there. You can have multiple Grouping Sets. For example:

   1:  SELECT clientID, employeeID, MONTH(billDate) as billingMonth, YEAR(billDate) as billingYear, SUM(hours) as totalHours
   2:  FROM dbo.billing
   3:  GROUP BY GROUPING SETS
   4:  (    ( employeeID ),
   5:       ( clientID ),
   6:       ( clientID, employeeID ),
   7:       ()
   8:  )
   9:  GROUP BY GROUPING SETS
  10:  (
  11:       ( MONTH(billDate), YEAR(billDate) ),
  12:       ( YEAR(billDate) )
  13:  );

What SQL Server does when it see multiple Grouping Sets, is it does a Cartesian join of the items from both Grouping Sets. In the SQL statement above, it would be the same as:

   1:  SELECT clientID, employeeID, MONTH(billDate) as billingMonth, YEAR(billDate) as billingYear, SUM(hours) as totalHours
   2:  FROM dbo.billing
   3:  GROUP BY GROUPING SETS
   4:  (    ( employeeID, MONTH(billDate) , YEAR(billDate)),
   5:       ( clientID, MONTH(billDate) , YEAR(billDate) ),
   6:       ( clientID, employeeID, MONTH(billDate) , YEAR(billDate) ),
   7:       ( MONTH(billDate), YEAR(billDate) ),
   8:       ( employeeID), YEAR(billDate) ),
   9:       ( clientID, YEAR(billDate)  ),
  10:       ( clientID, employeeID, YEAR(billDate)  ),
  11:       ( YEAR(billDate) )
  12:  );

Using this Cartesian join property of Grouping Sets, we can re-write our original query to this:

   1:  SELECT clientID, employeeID, YEAR(billDate) as billingYear, SUM(hours) as totalHours
   2:  FROM dbo.billing
   3:  GROUP BY GROUPING SETS
   4:  (    ( employeeID ),
   5:       ( clientID ),
   6:       ( clientID, employeeID ),
   7:       ()
   8:  )
   9:  GROUP BY GROUPING SETS
  10:  (
  11:       ( YEAR(billDate) )
  12:  );

If you have to do a lot of Group By queries, this new functionality in SQL Server 2008 could make your SQL statement much simpler to read and write.

Comments

Yen Thu Le
Posted on April 17, 2009

Hi
It’s been 2 months that I’ve tried to find a way to make this expression builder in the PR workflow but it’s always return FALSE status. I have this SQL expression to check the cost over the approve limit of the supervisor of the requester on PR, if the cost less than the supervisor then send to the Purchase Agent Inbox if not then send to the supervisor Inbox but with this expression, it always send to the Purchase Agent Inbox. (FALSE)  Just want to know if there is something wrong with this SQL statement?
:TOTALCOST<= raspberryR_SUPERVISOR.PERSON.USER.GROUPUSER.MAXGROUP.PRLIMIT
Thanks,

Post a Comment

Remember my personal information.
Notify me of follow-up comments?

We don't know if you're a human. Confirm below: