Imagine we have following data in a table:
groupName volume class mark
---------- ------- ------ ----
group1 50 1 o
group1 50 1 o
group1 50 1 x
group1 25 2 o
group2 25 1 x
group2 17 3 x
group2 11 2 o
group3 11 1 o
group3 19 3 x
And there is the need to add totals row at the end (SUM
for volume and NULL
for the rest of the columns).
I know that what I need can be achieved by adding total with the union all
as:
select 0 as isTotal, groupName, class, mark, volume
from dataTable
union all
select 1, NULL, NULL, NULL, sum(volume)
from dataTable
order by isTotal, groupName, class
but this way table is scanned twice.
To avoid scanning data twice, I tried using group by
:
select grouping(groupName) as isTotal, groupName, class, mark, sum(volume) as volume
from dataTable
group by grouping sets ((), (groupName, class, mark, volume))
order by isTotal, groupName, class
this way there is only one table scan, and I'm getting almost what I need, except that two first rows of the sample data (which are duplicates) are merged into one, and I need duplicates be kept as separate rows.
The question: Is it possible to get table data with the totals row added so that table data is scanned once and duplicates are kept as a separate rows?
The desired result is what is returned by union all
query:
isTotal groupName class mark volume
------- ----------- ------ ---- -------
0 group 1 1 o 50
0 group 1 1 o 50
0 group 1 1 x 50
0 group 1 2 o 25
0 group 2 1 x 25
0 group 2 2 o 11
0 group 2 3 x 17
0 group 3 1 o 11
0 group 3 3 x 19
1 NULL NULL NULL 258
The result returned by group by grouping sets
query:
isTotal groupName class mark volume
------- ---------- ------ ---- -------
0 group 1 1 o 100
0 group 1 1 x 50
0 group 1 2 o 25
0 group 2 1 x 25
0 group 2 2 o 11
0 group 2 3 x 17
0 group 3 1 o 11
0 group 3 3 x 19
1 NULL NULL NULL 258
Even thought you have duplicated rows, you can make them unique and fix your issue. One way to do this is using ROW_NUMBER
function.
For example:
DECLARE @DataSource TABLE
(
[groupName] VARCHAR(6)
,[volume] TINYINT
,[class] TINYINT
,[mark] CHAR(1)
);
INSERT INTO @DataSource ([groupName], [volume], [class], [mark])
VALUES ('group1', '50', '1', 'x')
,('group1', '50', '1', 'x')
,('group1', '50', '1', 'o')
,('group1', '25', '2', 'o')
,('group2', '25', '1', 'x')
,('group2', '17', '3', 'x')
,('group2', '11', '2', 'o')
,('group3', '11', '1', 'o')
,('group3', '19', '3', 'x');
WITH DataSource ([rowID], [groupName], [volume], [class], [mark]) AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1))
,[groupName]
,[volume]
,[class]
,[mark]
FROM @DataSource
)
SELECT GROUPING([groupName]) as [isTotal]
,[groupName]
,[class]
,[mark]
,SUM([volume]) AS [volume]
FROM DataSource
GROUP BY GROUPING SETS ((), ([rowID], [groupName], [volume], [class], [mark]))
ORDER BY [isTotal]
,[groupName]
,[class];
will give you:
exactly the same as your initial query:
select 0 as isTotal, groupName, class, mark, volume
from @DataSource
union all
select 1, NULL, NULL, NULL, sum(volume)
from @DataSource
order by isTotal, groupName, class
If you compare the execution plans you can see only one table scan is performed: