sql-servert-sql

The field in ORDER BY affects the result of window functions


I have simple T-SQL query, which calculates row number, rows count and total volume across all records:

DECLARE @t TABLE
(
    id        varchar(100),
    volume    float,
    prev_date date
);

INSERT INTO @t VALUES
('0318610084', 100, '2019-05-16'),
('0318610084', 200, '2016-06-04');

SELECT
    row_num    = ROW_NUMBER() OVER (PARTITION BY id ORDER BY prev_date),
    rows_count = COUNT(*) OVER (PARTITION BY id ORDER BY prev_date),
    vol_total  = SUM(volume) OVER (PARTITION BY  id ORDER BY prev_date),
    *
FROM @t;

I get the following result:

IMG1

However, this is NOT what I expected: in all two rows the rows_count must be 2 and vol_total must be 300:

IMG2

The workaround would be to add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. However, I thought that there must be another way.

In the end of the day I have found out that the ORDER BY clause must use id field rather prev_date field:

row_num    = ROW_NUMBER() OVER (PARTITION BY id ORDER BY id),
rows_count = COUNT(*) OVER (PARTITION BY id ORDER BY id),
vol_total  = SUM(volume) OVER (PARTITION BY id ORDER BY id)

After this change the query's output is as expected. But! I don't understand why is this so? How come the ordering affects partitioning?


Solution

  • For Aggregate functions generally it is not required to have order in the window definition unless you want to do the aggregation one at a time in an ordered fashion, it is like running total. Simply removing the orders will fix the problem.

    If I want to explain it from another way it would be like a window that is expanding row by row as you move on to another row. It is started with the first row, calculate the aggregation with all the rows from before (which in the first row is just the current row!) to the position of row.

    if you remove the order, the aggregation will be computed for all the rows in the window definition and no order of applying window will take effect.

    You can change the order in window definition to see the effect of it.

    Of course, ranking functions need the order and this point is just for the aggregations.

    DECLARE @t TABLE
    (
        id        varchar(100),
        volume    float,
        prev_date date
    );
    
    INSERT INTO @t VALUES
    ('0318610084', 100, '2019-05-16'),
    ('0318610084', 200, '2016-06-04');
    
    SELECT
       row_num    = ROW_NUMBER() OVER (PARTITION BY id ORDER BY prev_date),
       rows_count = COUNT(*) OVER (PARTITION BY id),
       vol_total  = SUM(volume) OVER (PARTITION BY  id),
       *
    FROM @t;
    

    Enabling order in the window for aggregations added after SqlServer 2012 and it was not part of the first release of the feature in 2005.

    For a detailed explanation of the order in window functions on aggregates this is a great help: Producing a moving average and cumulative total - SqlServer Documentation