Consider the following SQL window query:
SELECT LAG(no) OVER (ROWS BETWEEN CURRENT ROW AND CURRENT ROW) FROM account;
I'd like to understand what the behaviour should be according to ANSI standard SQL.
In the context of the way I understand SQL windows [corroborated by https://learnsql.com/blog/sql-window-functions-cheat-sheet/ ] we have a partition and a window frame. Here the partition is the whole table and the window frame is just the current row for each row. I thought that what should happen in this query is that we should get a table with null for every row, because the LAG should be evaluated inside the window frame, where there is no previous row (always only 1 row).
Example:
lag |
---|
null |
null |
null |
null |
null |
null |
Running this in Postgres however, I get a null for the first value, and then the previous account number for every following row.
lag |
---|
null |
1 |
2 |
3 |
4 |
5 |
This suggests that the LAG was performed on the partition (all the data) and not the window frame (just the current row). In contrast if you use SUM() for your aggregate function it only computes it on the window frame as it does a running total.
E.g. with the SUM aggregate:
lag |
---|
1 |
3 |
6 |
10 |
15 |
21 |
So the question is:
Is there a property of LAG (and some other functions?) which means that it's computed on the partition and not the window, and is this in correspondence with the ANSI standard? Or is that Postgres decides to assume that I wanted to use the partition instead of the window frame in this particular context?
I looked for the answer in the Postgres Documentation or in the SQL in a Nutshell reference book but they don't seem to cover this particular case.
The LAG
function and some other functions operate on the entire partition rather than a frame. As per MySQL docs this is part of "Standard SQL":
Standard SQL specifies that window functions that operate on the entire partition should have no frame clause. MySQL permits a frame clause for such functions but ignores it. These functions use the entire partition even if a frame is specified
You can also notice in PostgreSQL's documentation for the Lag
function that they explicitly mention (emphasis mine):
Returns value evaluated at the row that is offset rows after the current row within the partition