I have this old query using LT JOIN
to compare data with the previous item. I just noticed there is a LAG
function in QuestDB, and I would appreciate some help in getting the equivalent query of this to compare if execution is faster with LAG
WITH intervals AS (
SELECT
timestamp,
symbol AS metric,
amount AS volume
FROM trades
WHERE
timestamp in yesterday()
)
SELECT t1.timestamp, t1.metric AS metrics, t1.volume - t2.volume as delta, t1.volume as total
FROM intervals t1 LT JOIN intervals t2 ON(metric);
I tried this, but it is not working. It throws window function called in non-window context, make sure to add OVER clause
WITH intervals AS (
SELECT
timestamp,
symbol AS metric,
amount AS volume
FROM trades
WHERE
timestamp in yesterday()
)
SELECT t1.timestamp, t1.metric AS metrics, t1.volume - lag(t2.volume) as delta, t1.volume as total
FROM intervals t1 LT JOIN intervals t2 ON(metric);
The lag
function is a window function, so we need to define the "window frame", which typically requires either a timestamp (since LAG is time-sensitive, which is the previous item? How do I sort?) or/and a PARTITIONING
(should I just get the previous row, or the previous with the same SYMBOL value?).
The error we see is because of missing the OVER
part, which should be in this case over (partition by symbol order by timestamp)
.
In any case, if we try to add it directly at the point of the query with the lag
, we will get an error, as QuestDB, unlike other databases, does not allow any calculations on window function results. We need to do calculations on an outer query. In this case, we can just move the LAG
to the intervals
subquery, and then do the subtraction on the final query.
WITH intervals AS (
SELECT
timestamp,
symbol AS metric,
amount AS volume,
lag(amount) over (partition by symbol order by timestamp) as prev_volume
FROM trades
WHERE
timestamp in yesterday()
)
SELECT timestamp, metric AS metrics, volume - prev_volume as delta, volume as total
FROM intervals;
As a side note, the original LT JOIN
query might be problematic for the first rows if you have data on the same exact timestamp. If I have multiple rows on the same exact timestamp at the beginning of the results, since LT join
looks for strictly previous timestamp, all the rows are comparing with null
The window function based solution works strictly on position, not timestamp, so it correctly uses null only on the first row for each different symbol