sqlpostgresqlaggregate-functionswindow-functionsgenerate-series

Cumulative sum of values by month, filling in for missing months


I have this data table and I'm wondering if is possible create a query that get a cumulative sum by month considering all months until the current month.

date_added                    | qty
------------------------------------
2015-08-04 22:28:24.633784-03 | 1
2015-05-20 20:22:29.458541-03 | 1
2015-04-08 14:16:09.844229-03 | 1
2015-04-07 23:10:42.325081-03 | 1
2015-07-06 18:50:30.164932-03 | 1
2015-08-22 15:01:54.03697-03  | 1
2015-08-06 18:25:07.57763-03  | 1
2015-04-07 23:12:20.850783-03 | 1
2015-07-23 17:45:29.456034-03 | 1
2015-04-28 20:12:48.110922-03 | 1
2015-04-28 13:26:04.770365-03 | 1
2015-05-19 13:30:08.186289-03 | 1
2015-08-06 18:26:46.448608-03 | 1
2015-08-27 16:43:06.561005-03 | 1
2015-08-07 12:15:29.242067-03 | 1

I need a result like that:

Jan|0
Feb|0
Mar|0
Apr|5
May|7
Jun|7
Jul|9
Aug|15

Solution

  • This is very similar to other questions, but the best query is still tricky.

    Basic query to get the running sum quickly:

    SELECT to_char(date_trunc('month', date_added), 'Mon YYYY') AS mon_text
         , sum(sum(qty)) OVER (ORDER BY date_trunc('month', date_added)) AS running_sum
    FROM   tbl
    GROUP  BY date_trunc('month', date_added)
    ORDER  BY date_trunc('month', date_added);
    

    The tricky part is to fill in for missing months:

    WITH cte AS (
       SELECT date_trunc('month', date_added) AS mon, sum(qty) AS mon_sum
       FROM   tbl
       GROUP  BY 1
       )
    SELECT to_char(mon, 'Mon YYYY') AS mon_text
         , sum(c.mon_sum) OVER (ORDER BY mon) AS running_sum
    FROM  (SELECT min(mon) AS min_mon FROM cte) init
         , generate_series(init.min_mon, now(), interval '1 month') mon
    LEFT   JOIN cte c USING (mon)
    ORDER  BY mon;
    

    The implicit CROSS JOIN LATERAL requires Postgres 9.3+. This starts with the first month in the table.
    To start with a given month:

    WITH cte AS (
       SELECT date_trunc('month', date_added) AS mon, sum(qty) AS mon_sum
       FROM   tbl
       GROUP  BY 1
       )
    SELECT to_char(mon, 'Mon YYYY') AS mon_text
         , COALESCE(sum(c.mon_sum) OVER (ORDER BY mon), 0) AS running_sum
    FROM   generate_series('2015-01-01'::date, now(), interval '1 month') mon
    LEFT   JOIN cte c USING (mon)
    ORDER  BY mon;

    fiddle
    Old sqlfiddle

    Keeping months from different years apart. You did not ask for that, but you'll most likely want it.

    Note that the "month" to some degree depends on the time zone setting of the current session! Details:

    Related: