sqldatabasepostgresqlaggregate-functions

Custom aggregate function in PostgreSQL


Is it possible to write an aggregate function in PostgreSQL that will calculate a delta value, by substracting the initial (last value in the column) from the current(first value in column) ? It would apply on a structure like this

rankings (userId, rank, timestamp)

And could be used like

SELECT userId, custum_agg(rank) OVER w 
FROM rankings
WINDOWS w AS (PARTITION BY userId ORDER BY timstamp desc)

returning for an userId the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

Thanks!


Solution

  • the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

    There are many ways to achieve this with existing functions. Like the window functions first_value() and last_value(), combined with DISTINCT or DISTINCT ON and without joins and subqueries:

    SELECT DISTINCT ON (userid)
           userid, last_value(rank) OVER w - first_value(rank) OVER w AS rank_delta
    FROM   rankings
    WINDOW w AS (PARTITION BY userid ORDER BY ts
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
    

    Note the custom window frame!

    Or you can use basic aggregate functions - with joins and a subquery:

    SELECT userid, r2.rank - r1.rank AS rank_delta
    FROM  (
      SELECT userid, min(ts) AS first_ts, max(ts) AS last_ts
       FROM  rankings
       GROUP BY 1
       ) sub
    JOIN   rankings r1 USING (userid)
    JOIN   rankings r2 USING (userid)
    WHERE  r1.ts = first_ts
    AND    r2.ts = last_ts;
    

    Assuming unique (userid, rank), or your requirements would be ambiguous.

    Shichinin no samurai

    ... a.k.a. "7 Samurai"
    The same for only the last seven rows per userid (or as many as can be found, if there are fewer) - one of the shortest ways:

    SELECT DISTINCT ON (userid)
           userid, first_value(rank) OVER w - last_value(rank) OVER w AS rank_delta
    FROM   rankings
    WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
                 ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING)
    ORDER  BY userid, ts DESC;
    

    Note the reversed sort order. The first row is the "newest" entry. I span a frame of (max.) 7 preceding rows and pick only the results for the newest entry per userid with DISTINCT ON.

    fiddle
    Old sqlfiddle 1, sqlfiddle 2