Once in a while I have the use case of doing a SAMPLE BY
with a FILL
to interpolate empty rows for each interval, but with fixed boundaries, for example for 7 days from now. I know this has been asked in the past, but the result query I have is error prone.
In my query below I want to get hourly totals per merchant_id and for the past 7 days. I want to have a dense result with an hour per merchant_id even if I have no activity in that hour.
I need to use the merchant_id, start, and end dates in different parts of the queries, at the two UNIONS
around the main query and also in the WHERE
condition of my main query. It is not the first time I remember to change in one place but not in the other, and results are wrong.
I was wondering if there is some way to re-write this query so I need to enter the start and end timestamps just once to make it more robust.
WITH bounded AS ( (
SELECT date_trunc('day', dateadd('d', -31, now())) as timestamp, 1 as merchant_id, 0 as deposit, 0 as withdraw
UNION ALL
SELECT timestamp, merchant_id,
CASE WHEN transaction_type = 'deposit' then 1 ELSE 0 END AS deposit,
CASE WHEN transaction_type = 'withdraw' then 1 ELSE 0 END AS withdraw
FROM transactions
WHERE merchant_id = 1 AND status = 'success'
AND timestamp between date_trunc('day', dateadd('d', -31, now())) AND now()
UNION ALL
SELECT now() as timestamp, 1 as merchant_id, 0 as deposit, 0 as withdraw
) timestamp(timestamp) )
SELECT
timestamp,
merchant_id,
SUM(deposit) AS transaction_deposit_count,
SUM(withdraw) AS transaction_withdraw_count
FROM bounded
SAMPLE BY 1d FILL(0,0);
I found a way to solve this :)
In QuestDB you can declare query variables, and those variables can be not just literals but actually function calls. I declared @lower_bound and @upper_bound variables and I can just use them inside my query as many times as needed, but changing at a single point. This way it is impossible to get inconsistent values and broken results.
DECLARE
@merchant_id := 1,
@lower_bound := date_trunc('day', dateadd('d', -7, now())),
@upper_bound := now()
WITH bounded AS ( (
SELECT @lower_bound as timestamp, @merchant_id as merchant_id, 0 as deposit, 0 as withdraw
UNION ALL
SELECT timestamp, merchant_id,
CASE WHEN transaction_type = 'deposit' then 1 ELSE 0 END AS deposit,
CASE WHEN transaction_type = 'withdraw' then 1 ELSE 0 END AS withdraw
FROM transactions
WHERE merchant_id = 1 AND status = 'success'
AND timestamp between @lower_bound AND @upper_bound
UNION ALL
SELECT @upper_bound as timestamp, @merchant_id as merchant_id, 0 as deposit, 0 as withdraw
) timestamp(timestamp) )
SELECT
timestamp,
merchant_id,
SUM(deposit) AS transaction_deposit_count,
SUM(withdraw) AS transaction_withdraw_count
FROM bounded
SAMPLE BY 1d FILL(0,0);