I have a table with logs of going inside and outside the building. The table looks like that:
user_id | datetime | direction |
---|---|---|
1 | 17/2/2025, 18:25:02.000 | in |
1 | 17/2/2025, 20:09:10.000 | out |
2 | 17/2/2025, 09:55:57.000 | in |
2 | 17/2/2025, 20:48:37.000 | out |
In fact I have hundreds of people every day and long period, not only one day. I need to count maximum number of people inside the building within each hour. I think it could be easier if I could take first entrance and last leaving within each day, but it will be incorrect since person may go outside and inside of building several times within a day and I only need to count hours, where person is inside.
The result table should look like this:
hour | count |
---|---|
17/2/2025, 09:00 | 1 |
17/2/2025, 10:00 | 1 |
17/2/2025, 11:00 | 1 |
17/2/2025, 12:00 | 1 |
17/2/2025, 13:00 | 1 |
17/2/2025, 14:00 | 1 |
17/2/2025, 15:00 | 1 |
17/2/2025, 16:00 | 1 |
17/2/2025, 17:00 | 1 |
17/2/2025, 18:00 | 2 |
17/2/2025, 19:00 | 2 |
17/2/2025, 20:00 | 2 |
is there a way to calculate this in PostgreSQL?
You can use generate_series, a range type and the overlap operator in a window function:
WITH input AS (
SELECT *
, tsrange(lag(datetime) OVER inout, datetime, '[)') AS timeframe
FROM mytable
WINDOW inout as (PARTITION BY user_id ORDER BY datetime)
)
SELECT ts,
count(user_id)
FROM generate_series('2025-02-17 08:00'::timestamp, '2025-02-18 14:00'::timestamp, interval '1 hour') g(ts)
LEFT JOIN input ON timeframe && tsrange(ts, ts + interval '1 hour')
AND direction = 'out' -- must have checked out
GROUP BY 1
ORDER BY 1;
See also https://dbfiddle.uk/HQG5DQOM