sqlsnowflake-cloud-data-platform

SQL-Snowflake - How to count the days between dates in 2 or more rows with WHERE condition


COUNTRY STORE_ID ORDER_DATE
DE 990003975 2023-01-24
FR 990004963 2023-04-11
FR 990005204 2023-06-15
FR 990005204 2023-06-10
FR 990005204 2023-06-07
JP 990005210 2023-01-08

I have data presented like the table above. I need a way to add a column that shows me the difference in days between the dates of the same STORE_ID, ordering them in DESC order, I need the difference between the top date, and the previous one. Repeat for other dates of the same STORE_ID if they exist. If there are no more dates, or just 1, put a 0. So, for this example, I need to get something like this:

COUNTRY STORE_ID ORDER_DATE DAYS
DE 990003975 2023-01-24 0
FR 990004963 2023-04-11 0
FR 990005204 2023-06-15 5
FR 990005204 2023-06-10 3
FR 990005204 2023-06-07 0
JP 990005210 2023-01-08 0

I tried something like this that I saw in another question, but not sure why it's not working for me (I get nulls in the new column)

SELECT  country,
        store_id,
        order_date,
        NextDate,
        DATEDIFF(day, order_date, NextDate)
FROM    (   SELECT  country,
                    store_id,
                    order_date,
                    (   SELECT  MIN(order_date) 
                        FROM    main T2
                        WHERE   T2.store_id = T1.store_id
                        AND     T2.country > T1.country
                        AND     T2.order_date > T1.order_date
                    ) AS NextDate
            FROM    main T1
        ) AS T
order by 2,3

Solution

  • So I swapped to LEAD over LAG, and your desciption of time descending, changes the relationship of what the "other wanted value" is:

    with data(country, store_id, order_date) as (
        select * from values
            ('DE', 990003975, '2023-01-24'::date),
            ('FR', 990004963, '2023-04-11'::date),
            ('FR', 990005204, '2023-06-15'::date),
            ('FR', 990005204, '2023-06-10'::date),
            ('FR', 990005204, '2023-06-07'::date),
            ('JP', 990005210, '2023-01-08'::date)
    )
    select 
        *
        ,lead(order_date,1,order_date) over (partition by store_id order by order_date desc) as lead_order_date
        ,datediff('days', lead_order_date, order_date) as days
    from data
    order by 2,3 desc;
    

    gives:

    enter image description here

    which can be compressed like:

    select 
        *
        ,datediff(
            'days', 
            lead(order_date,1,order_date) over (partition by store_id order by order_date desc), 
            order_date
        ) as days
    from data
    order by 2,3 desc;
    

    giving a cleaner answer:

    enter image description here