I am trying to write a query to sum the amount of orders for each part number between two dates (order date and lookback date (order date - lead time)).
For example,
Part Num | Order Date | Lookback Date | Lead Time | Order Qty | Total Orders |
---|---|---|---|---|---|
A | 01-20-2023 | 01-10-2023 | 10 | 2 | 3 |
A | 01-15-2023 | 01-05-2023 | 10 | 1 | 1 |
I essentially am looking to calculate the "Total Orders" column....
This is the query I currently have and I keep getting an error for using "Lead Time" after the between function.
select
"Part Number",
"Snapshot Date",
sum("Lead Time"),
sum("Order Qty") over (partition by "Part Number" order by
"Snapshot Date" desc rows between "Lead Time" preceding and current
row) as "Total Demand"
from table1
My source data is structured as :
Snapshot Date | Part Number | Lead Time | Order Qty |
---|---|---|---|
01-01-2022 | A | 15 | 0 |
01-01-2022 | B | 10 | 1 |
01-02-2022 | A | 15 | 1 |
01-02-2022 | B | 10 | 0 |
01-03-2022 | A | 15 | 5 |
01-03-2022 | B | 10 | 3 |
I see what you are trying to do with the windowed function. Unfortunately you can't specify a dynamic range of rows for each window to look at. Also the ROWS specifies a number of rows and you want a number of days (relative to your Order/Snapshot date (Seems these mean the same thing. I have used SnapshotDate from here on out)).
The trick here is to calculate the LookbackDate, and then match all the relevant SnapshotDates that fall within a LookbackDate and a SnapshotDate. Then you can aggregate to get the total. So in essence a 'self join' is required.
My below solution is in T-SQL. But everything except the temp table syntax should work in Snowflake.
--Create some data to work with
DROP TABLE IF EXISTS #InputData;
CREATE TABLE #InputData(SnapshotDate DATE NOT NULL
, PartNum CHAR(1) NOT NULL
, LeadTime BIGINT NOT NULL
, OrderQty BIGINT NOT NULL
)
INSERT INTO #InputData(SnapshotDate
, PartNum
, LeadTime
, OrderQty
)
SELECT CONVERT(DATE, X.SnapshotDate, 105)
, X.PartNum
, X.LeadTime
, X.OrderQty
FROM (VALUES
('01-01-2022', 'A', 15, 0)
, ('01-01-2022', 'B', 10, 1)
, ('01-02-2022', 'A', 15, 1)
, ('01-02-2022', 'B', 10, 0)
, ('01-03-2022', 'A', 15, 5)
, ('01-03-2022', 'B', 10, 3)
) X(SnapshotDate
, PartNum
, LeadTime
, OrderQty
);
/*
IMPORTANT ASSUMPTION: The combination of SnapshotDate and PartNum is unique.
*/
--4. Finally do a 1 to 1 Join back onto your orginal dataset, to include the total and the lookback date.
SELECT inp.PartNum
, inp.SnapshotDate
, subq2.LookbackDate
, inp.LeadTime
, inp.OrderQty
, subq2.OrderQtyInLastLeadManyDays
FROM #InputData inp
INNER JOIN (
--3. Sum the order quantity for each snapshot date, lookback date and part num.
SELECT subq.SnapshotDate
, subq.LookbackDate
, subq.PartNum
, SUM(inp.OrderQty) AS OrderQtyInLastLeadManyDays
FROM #InputData inp
INNER JOIN(
--1. lets create the lookback date from the SnapshotDate and the LeadTime
SELECT SnapshotDate
, DATEADD(DAY, -LeadTime, SnapshotDate) AS LookbackDate
, PartNum
, OrderQty
FROM #InputData
) subq
--2.Join the data set onto itself, where the SnapshotDate is within the date range (LookbackDate, SnapshotDate). Also make sure PartNum is the same...
--I've chosen not to be inclusive on both ends (like a BETWEEN), as it didn't feel right. But you can easily change this if you wish.
ON inp.SnapshotDate > subq.LookbackDate AND inp.SnapshotDate <= subq.SnapshotDate
AND inp.PartNum = subq.PartNum
GROUP BY subq.SnapshotDate
, subq.LookbackDate
, subq.PartNum
) subq2
ON inp.SnapshotDate = subq2.SnapshotDate
AND inp.PartNum = subq2.PartNum
ORDER BY inp.SnapshotDate ASC
, inp.PartNum ASC;
Try it yourself on db<>fiddle. You are probably going to want to change the input data set. As it currently stands there is nothing 'interesting' going on. All your SnapshotDates (for the same PartNum) are about 30 days apart. However, your lead times are less than 30 days. Play around with the values and see if this answers your question.