I can calculate the cumulative rainfall since 9am for a site from a series of 10-minute records, but the nested SQL seems verbose and I'd like to make it more concise for future maintainability. A CTE might make it a little cleaner, but is there a way to get this in a single call with SUM and PARTITION BY, without my current CASE approach?
The following Postgres query calculates the rainfall since 9am using 10-minute values. First, I add a field that determines which 9am-to-9am day each record belongs to. I want 9:00 exactly to count as part of the previous day's "rainfall since 9am" value, but anything after that, i.e. from 9:01 onwards, should be the point at which the running total resets. I'm using 10-minute records in this example, but I do want this to work down to the minute, because other sites report every minute; this example site just happens to report every 10 minutes.
SELECT
    site_name,
    date_time_utc AT TIME ZONE 'Australia/Hobart' AS date_time_local,
    precip_10min,
    precip_since_9am,
    day_9am_val,
    -- Running total that resets per site at the start of each 9am-to-9am day
    SUM(precip_10min) OVER
        (PARTITION BY site_name, day_9am_val ORDER BY date_time_utc) AS precip_since_9am_cal
FROM
    (
        SELECT *,
            -- Local date on which the record's 9am-to-9am day began;
            -- 09:00 itself is counted with the previous day
            CASE
                WHEN DATE_PART('hour', date_time_utc AT TIME ZONE 'Australia/Hobart') > 9 THEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::date
                WHEN DATE_PART('hour', date_time_utc AT TIME ZONE 'Australia/Hobart') < 9 THEN ((date_time_utc AT TIME ZONE 'Australia/Hobart')::date - INTERVAL '1 day')::date
                WHEN DATE_PART('minutes', date_time_utc AT TIME ZONE 'Australia/Hobart') > 0 THEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::date
                ELSE ((date_time_utc AT TIME ZONE 'Australia/Hobart')::date - INTERVAL '1 day')::date
            END AS day_9am_val
        FROM
            temp_export
    ) tbl
ORDER BY
    site_name, date_time_utc DESC
The expected value is provided as precip_since_9am, and the calculated value (which should match) is precip_since_9am_cal. The day_9am_val column is not required; it just illustrates how I tackled partitioning the days at a 9am cutoff. I've placed a small subset of the data in an SQL Fiddle, and some example rows are shown below:
site_name | date_time_local | precip_10min | precip_since_9am | day_9am_val | precip_since_9am_cal |
---|---|---|---|---|---|
sitea | 2024-01-18 17:00:00 | 0.2 | 10.8 | 2024-01-18 | 10.8 |
sitea | 2024-01-18 16:50:00 | 0.4 | 10.6 | 2024-01-18 | 10.6 |
sitea | 2024-01-18 16:40:00 | 0.2 | 10.2 | 2024-01-18 | 10.2 |
sitea | 2024-01-18 16:30:00 | 0.2 | 10 | 2024-01-18 | 10.0 |
sitea | 2024-01-18 16:20:00 | 0.4 | 9.8 | 2024-01-18 | 9.8 |
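The 9am cutoff rule described above can be checked in isolation against a few hand-written local timestamps. This is only a minimal sketch: the `sample` CTE and its `local_ts` column are made-up names standing in for `date_time_utc AT TIME ZONE 'Australia/Hobart'`, and `local_ts::date - 1` is used as a shorter equivalent of subtracting a one-day interval and casting back to a date.

```sql
-- Illustrative only: "sample" and "local_ts" are hypothetical names standing in
-- for date_time_utc AT TIME ZONE 'Australia/Hobart' in the real table.
WITH sample (local_ts) AS (
    VALUES
        (TIMESTAMP '2024-01-18 08:50:00'),  -- before 9am: belongs to 2024-01-17
        (TIMESTAMP '2024-01-18 09:00:00'),  -- exactly 9am: still belongs to 2024-01-17
        (TIMESTAMP '2024-01-18 09:01:00'),  -- after 9am: belongs to 2024-01-18
        (TIMESTAMP '2024-01-18 17:00:00')   -- afternoon: belongs to 2024-01-18
)
SELECT
    local_ts,
    CASE
        WHEN DATE_PART('hour', local_ts) > 9 THEN local_ts::date
        WHEN DATE_PART('hour', local_ts) < 9 THEN local_ts::date - 1
        WHEN DATE_PART('minute', local_ts) > 0 THEN local_ts::date
        ELSE local_ts::date - 1
    END AS day_9am_val
FROM sample
ORDER BY local_ts;
```

The two rows on either side of 09:00 are the interesting ones: they should fall into different partitions, which is what makes the running total reset at 9:01 rather than at 9:00.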
I ended up rewriting the query to remove the sub-select, per NickW's suggestion, and put the CASE expression directly in the PARTITION BY fields:
SELECT
    site_name,
    date_time_utc AT TIME ZONE 'Australia/Hobart' AS date_time_local,
    precip_10min,
    precip_since_9am,
    SUM(precip_10min) OVER (PARTITION BY site_name, (CASE
            WHEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::time > '9:00'
                THEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::date
            ELSE ((date_time_utc AT TIME ZONE 'Australia/Hobart')::date - INTERVAL '1 day')::date
        END) ORDER BY date_time_utc) AS precip_since_9am_calc
FROM
    temp_export
ORDER BY
    site_name, date_time_utc DESC
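A possible further simplification, offered as an untested sketch rather than a verified answer, is to drop the CASE altogether by shifting the local timestamp back by nine hours before casting it to a date. To keep 09:00:00 exactly in the previous day, the shift below is nine hours plus one microsecond, which is the resolution of a Postgres timestamp. It assumes `date_time_utc` is a `timestamptz`, so that `AT TIME ZONE` returns local wall-clock time.

```sql
-- Sketch only: assumes date_time_utc is timestamptz, so AT TIME ZONE
-- yields a local timestamp without time zone.
SELECT
    site_name,
    date_time_utc AT TIME ZONE 'Australia/Hobart' AS date_time_local,
    precip_10min,
    precip_since_9am,
    SUM(precip_10min) OVER (
        PARTITION BY
            site_name,
            -- Shift back by 9 hours plus one microsecond: 09:00:00 exactly lands
            -- on the previous date, anything later lands on the current date.
            ((date_time_utc AT TIME ZONE 'Australia/Hobart')
                - INTERVAL '9 hours 1 microsecond')::date
        ORDER BY date_time_utc
    ) AS precip_since_9am_calc
FROM
    temp_export
ORDER BY
    site_name, date_time_utc DESC;
```

Because the subtraction is applied to the local wall-clock timestamp, the partition key should follow the same rule as the `::time > '9:00'` comparison, but it is worth comparing the output against precip_since_9am on real data before relying on it.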