sqlamazon-athenapresto

How to lag using dynamic offset in SQL AWS Athena?


I have a table with a column of numbers and then other columns with other data. I have created column "floor10" that has floor of those numbers as a multiple of 10 (e.g. 8 -> 0, 17 -> 10, etc.). What I'd like to do is create another column that has the lag of "floor10" and shows it for each row. There's no fixed number of rows per distinct value of "floor10" so I'm not sure how I would do this using LAG in SQL on AWS Athena.

For example,

Num floor10 Name
2 0 James
23 20 James
28 20 James
16 10 James
38 30 James
8 0 John
54 50 John
56 50 John
28 20 John
22 20 John
25 20 John

I'd like something like,

Num floor10 Name floor10_prev
2 0 James
16 10 James 0
23 20 James 10
28 20 James 10
38 30 James 20
8 0 John
22 20 John 0
25 20 John 0
28 20 John 0
54 50 John 20
56 50 John 20

The following did not work:

SELECT
Num,
floor10,
Name,
LAG(floor10) OVER (PARTITION BY Name, Num ORDER BY Name, Num) AS floor10_prev
FROM table

Alternatively, I tried making a subquery with distinct "floor10" and "Name" and then use lag on that before joining on the main table. While this got the desired result, this doesn't seem efficient and doing that on a much larger dataset took a long time to complete.


Solution

  • You can use dynamically calculate shift value for lag() - that is row number in group of rows with (name,num). See example

    select *
      ,lag(floor10,rn)over(partition by name order by floor10) prev_fllor10
    from(
       select *
         ,row_number()over(partition by name,floor10 order by floor10)rn
       from test
    )a
    
    Num floor10 Name rn prev_fllor10
    2 0 James 1 null
    16 10 James 1 0
    23 20 James 1 10
    28 20 James 2 10
    38 30 James 1 20
    8 0 John 1 null
    22 20 John 1 0
    25 20 John 2 0
    28 20 John 3 0
    54 50 John 1 20
    56 50 John 2 20

    fiddle

    Test data

    Num floor10 Name
    2 0 James
    23 20 James
    28 20 James
    16 10 James
    38 30 James
    8 0 John
    54 50 John
    56 50 John
    28 20 John
    22 20 John
    25 20 John