sqlpartitioningsnowflake-schema

How do I sum the number of order quantities for each part number from the current date to the current date - lead time?


I am trying to write a query to sum the amount of orders for each part number between two dates (order date and lookback date (order date - lead time)).

For example,

Part Num Order Date Lookback Date Lead Time Order Qty Total Orders
A 01-20-2023 01-10-2023 10 2 3
A 01-15-2023 01-05-2023 10 1 1

I essentially am looking to calculate the "Total Orders" column....

This is the query I currently have and I keep getting an error for using "Lead Time" after the between function.

 select 
    "Part Number",
    "Snapshot Date",
    sum("Lead Time"),
    sum("Order Qty") over (partition by "Part Number" order by 
    "Snapshot Date" desc rows between "Lead Time" preceding and current 
    row) as "Total Demand"
  from table1

My source data is structured as :

Snapshot Date Part Number Lead Time Order Qty
01-01-2022 A 15 0
01-01-2022 B 10 1
01-02-2022 A 15 1
01-02-2022 B 10 0
01-03-2022 A 15 5
01-03-2022 B 10 3

Solution

  • I see what you are trying to do with the windowed function. Unfortunately you can't specify a dynamic range of rows for each window to look at. Also the ROWS specifies a number of rows and you want a number of days (relative to your Order/Snapshot date (Seems these mean the same thing. I have used SnapshotDate from here on out)).

    The trick here is to calculate the LookbackDate, and then match all the relevant SnapshotDates that fall within a LookbackDate and a SnapshotDate. Then you can aggregate to get the total. So in essence a 'self join' is required.

    My below solution is in T-SQL. But everything except the temp table syntax should work in Snowflake.

    --Create some data to work with
    DROP TABLE IF EXISTS #InputData;
    CREATE TABLE #InputData(SnapshotDate DATE NOT NULL
                        , PartNum CHAR(1) NOT NULL
                        , LeadTime BIGINT NOT NULL
                        , OrderQty BIGINT NOT NULL
                        )
    INSERT INTO #InputData(SnapshotDate
                        , PartNum
                        , LeadTime
                        , OrderQty
                        )
    SELECT CONVERT(DATE, X.SnapshotDate, 105)
        , X.PartNum
        , X.LeadTime
        , X.OrderQty
    FROM (VALUES
                ('01-01-2022', 'A', 15, 0)
                , ('01-01-2022', 'B', 10, 1)
                , ('01-02-2022', 'A', 15, 1)
                , ('01-02-2022', 'B', 10, 0)
                , ('01-03-2022', 'A', 15, 5)
                , ('01-03-2022', 'B', 10, 3)
                
        ) X(SnapshotDate
            , PartNum
            , LeadTime
            , OrderQty
            );
    
        
    /*
    IMPORTANT ASSUMPTION: The combination of SnapshotDate and PartNum is unique.
    */      
    
    --4. Finally do a 1 to 1 Join back onto your orginal dataset, to include the total and the lookback date.
    SELECT inp.PartNum
        , inp.SnapshotDate
        , subq2.LookbackDate
        , inp.LeadTime
        , inp.OrderQty
        , subq2.OrderQtyInLastLeadManyDays
    FROM #InputData inp
    INNER JOIN (
        --3. Sum the order quantity for each snapshot date, lookback date and part num.
        SELECT subq.SnapshotDate
            , subq.LookbackDate
            , subq.PartNum
            , SUM(inp.OrderQty) AS OrderQtyInLastLeadManyDays
        FROM #InputData inp
        INNER JOIN(
            --1. lets create the lookback date from the SnapshotDate and the LeadTime
            SELECT SnapshotDate
                , DATEADD(DAY, -LeadTime, SnapshotDate) AS LookbackDate
                , PartNum
                , OrderQty
            FROM #InputData
            ) subq
            --2.Join the data set onto itself, where the SnapshotDate is within the date range (LookbackDate, SnapshotDate). Also make sure PartNum is the same...
            --I've chosen not to be inclusive on both ends (like a BETWEEN), as it didn't feel right. But you can easily change this if you wish.
            ON inp.SnapshotDate > subq.LookbackDate AND inp.SnapshotDate <= subq.SnapshotDate
                AND inp.PartNum = subq.PartNum
        GROUP BY subq.SnapshotDate
            , subq.LookbackDate
            , subq.PartNum
        ) subq2
        ON inp.SnapshotDate = subq2.SnapshotDate
            AND inp.PartNum = subq2.PartNum
    ORDER BY inp.SnapshotDate ASC
        , inp.PartNum ASC;
    

    Try it yourself on db<>fiddle. You are probably going to want to change the input data set. As it currently stands there is nothing 'interesting' going on. All your SnapshotDates (for the same PartNum) are about 30 days apart. However, your lead times are less than 30 days. Play around with the values and see if this answers your question.