I want to calculate a 10 day rolling sum. The problem is that the time is not regular, eg sometimes we have a couple of observation on the same date and then some dates are missing in between.
Mock data looks as following:
order_id | order_completed_at | order_amount | customer_id | date_diff_10_days | sum_10_days |
---|---|---|---|---|---|
ord_1 | 2024-05-01 | 1 | aad_1 | 2024-04-21 | 1 |
ord_2 | 2024-05-01 | 5 | aad_1 | 2024-04-21 | 6 |
ord_3 | 2024-05-05 | 10 | aad_1 | 2024-04-25 | 16 |
ord_4 | 2024-05-15 | 15 | aad_1 | 2024-05-05 | 25 |
My idea was to use a window function, together with sum(if(date_diff_10_days <= order_completed_at , order_amount,0 ))
select
* ,
sum(if(
date_diff_10_days <= order_completed_at
, order_amount, 0)) as sum_10_days
OVER (PARTITION BY customer_id order by order_completed_at asc
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
from mock_data
The query returns a cumulative sum over all records and not only of records within a 10 days time window. Does somebody have idea how I can fix my query?
I created a sample query order_id
, order_completed_at
, order_amount
and calculated the rolling sum for last 10 days.I dont have Big Query, so unable to test. I have used postgres as an example.
I have used additional dates to test scenarios in the fiddle, let me know
WITH rolling_sum AS (
SELECT
o1.order_id,
o1.order_completed_at,
o1.order_amount,
o1.customer_id,
SUM(o2.order_amount) AS sum_10_days
FROM orders o1
JOIN orders o2
ON o1.customer_id = o2.customer_id
AND o2.order_completed_at >= o1.order_completed_at - INTERVAL '10 days'
AND o2.order_completed_at <= o1.order_completed_at
GROUP BY
o1.order_id,
o1.order_completed_at,
o1.order_amount,
o1.customer_id
)
SELECT * FROM rolling_sum
ORDER BY order_id;
Output