[SOLVED] Custom aggregate function in PostgreSQL

Custom aggregate function in PostgreSQL

Is it possible to write an aggregate function in PostgreSQL that will calculate a delta value, by substracting the initial (last value in the column) from the current(first value in column) ? It would apply on a structure like this

rankings (userId, rank, timestamp)

And could be used like

SELECT userId, custum_agg(rank) OVER w 
FROM rankings
WINDOWS w AS (PARTITION BY userId ORDER BY timstamp desc)

returning for an userId the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

Thanks!

Solution

the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

There are many ways to achieve this with existing functions. Like the window functions first_value() and last_value(), combined with DISTINCT or DISTINCT ON and without joins and subqueries:

SELECT DISTINCT ON (userid)
       userid, last_value(rank) OVER w - first_value(rank) OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

Note the custom window frame!

Or you can use basic aggregate functions - with joins and a subquery:

SELECT userid, r2.rank - r1.rank AS rank_delta
FROM  (
  SELECT userid, min(ts) AS first_ts, max(ts) AS last_ts
   FROM  rankings
   GROUP BY 1
   ) sub
JOIN   rankings r1 USING (userid)
JOIN   rankings r2 USING (userid)
WHERE  r1.ts = first_ts
AND    r2.ts = last_ts;

Assuming unique (userid, rank), or your requirements would be ambiguous.

Shichinin no samurai

^{... a.k.a. "7 Samurai"}
The same for only the last seven rows per userid (or as many as can be found, if there are fewer) - one of the shortest ways:

SELECT DISTINCT ON (userid)
       userid, first_value(rank) OVER w - last_value(rank) OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
             ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING)
ORDER  BY userid, ts DESC;

Note the reversed sort order. The first row is the "newest" entry. I span a frame of (max.) 7 preceding rows and pick only the results for the newest entry per userid with DISTINCT ON.

fiddle
_{Old sqlfiddle 1, sqlfiddle 2}