I have a table with financial transactions, and want to create a running total per day. Is this the correct way to do it? Is there a better way? It seems to work, but I could not find this specific combination (SUM ... OVER
with GROUP BY
) on stackoverflow, and the nested SUM
asks for an extra pair of eyes:
SELECT date,
SUM(SUM(amount)) over (PARTITION BY TRUE ORDER BY date) AS value
FROM transactions
GROUP BY date
ORDER BY date ASC
Example data:
date amount
2024-11-01 00:00:00+00 - 1.9
2024-11-01 00:00:00+00 - 1.9
2024-11-02 00:00:00+00 - 1.9
2024-11-02 00:00:00+00 10
2024-11-05 00:00:00+00 3.9
2024-11-08 00:00:00+00 - 1.95
2024-11-13 00:00:00+00 - 10
2024-11-16 00:00:00+00 80
2024-11-18 00:00:00+00 85
2024-11-23 00:00:00+00 498.25
2024-11-27 00:00:00+00 -498.25
2024-11-30 00:00:00+00 -650
Expected result:
date amount
2024-11-01 00:00:00+00 - 3.8
2024-11-02 00:00:00+00 4.3
2024-11-05 00:00:00+00 8.2
2024-11-08 00:00:00+00 6.25
2024-11-13 00:00:00+00 - 3.75
2024-11-16 00:00:00+00 76.25
2024-11-18 00:00:00+00 161.25
2024-11-23 00:00:00+00 659.5
2024-11-27 00:00:00+00 161.25
2024-11-30 00:00:00+00 -488.75
Well, I think it's difficult to answer what is "correct" or "better" in this case. Obviously, your query is correct and produces the expected result. Except for the point that you will likely better cast the timestamp to a whole date to summarize data of entire days if the time differs.
A simpler way to get the intended result would be DISTINCT ON
:
SELECT
DISTINCT ON (trans_date::DATE)
trans_date::DATE AS trans_date,
SUM(amount) OVER(ORDER BY trans_date)
FROM transactions;
But this seems to be slower than your current idea:
SELECT
trans_date::DATE,
SUM(SUM(amount)) OVER (ORDER BY trans_date::DATE) AS value
FROM transactions
GROUP BY trans_date::DATE
ORDER BY trans_date::DATE;
I tried both queries on this db<>fiddle with some random rows. According to the query plan, the DISTINCT ON
requires more time.
So in my opinion, your query is fine as it is.