snowflake-cloud-data-platformbi-temporalpoint-in-time

Snowflake Query for Latest Snapshot From Bitemporal Data


Given a table of data with bitemporal modeling where there are 2 dates: (i) the date that the data applies to, and (ii) the datetime at which the fact is known

City   Temp  Date        As_of_Datetime
———    ———-  ———-        -——————-
Boston 32    2022/07/01  2022/06/28 13:23:00
Boston 31    2022/07/01  2022/06/29 13:23:00
Miami  74    2022/07/01  2022/06/28 13:23:00
Miami  75    2022/07/01  2022/06/29 13:23:00

What snowflake query will give the latest snapshot of the data for each date based on the most recent As_of_Datetime?

The expected result would be

City   Temp    Date
Boston 31      2022/07/01
Miami  75      2022/07/01

I tried using the last_value function

select City, Date, last_value(Temp) over (partition by City, Date order by As_of_Datetime) as Temp
from temperature_table
order by City, Date

but that produced duplicate rows where the same last value is repeated:

Boston   31    2022/07/01
Boston   31    2022/07/01
Miami    75    2022/07/01
Miami    75    2022/07/01

Ideally there should only be 1 row returned for each (City, Date) combo.

Thank you in advance for your consideration and response.


Solution

  • It could be achieved by using QUALIFY and ROW_NUMBER - partitioned by City, Date and sorted As_of_DateTime descending:

    SELECT *
    FROM tab
    QUALIFY ROW_NUMBER() OVER(PARTITION BY City, Date ORDER BY As_of_DateTime DESC) = 1