I have a 400GB table where a new entry will be added every 2 seconds. This table holds event states for microchips. I have a created_at Datetime field and a topic String field to determine the event state. My task is now to delete all idle events in a 5 minute interval beginning from the first idle entry.
For example:
id | created_at | topic |
---|---|---|
UUIDv4 | 2022-03-13 12:42:16 | MAC-ADDRESS1/idle |
UUIDv4 | 2022-03-13 12:42:18 | MAC-ADDRESS1/idle |
UUIDv4 | 2022-03-13 12:42:20 | MAC-ADDRESS1/idle |
UUIDv4 | 2022-03-13 12:42:16 | MAC-ADDRESS2/idle |
UUIDv4 | 2022-03-13 12:42:18 | MAC-ADDRESS2/idle |
UUIDv4 | 2022-03-13 12:42:20 | MAC-ADDRESS2/idle |
... | ... | ... |
UUIDv4 | 2022-03-13 12:47:16 | MAC-ADDRESS1/idle |
UUIDv4 | 2022-03-13 12:47:18 | MAC-ADDRESS1/idle |
UUIDv4 | 2022-03-13 12:47:16 | MAC-ADDRESS2/idle |
UUIDv4 | 2022-03-13 12:47:18 | MAC-ADDRESS2/idle |
If I start at 2022-03-13 12:42:16 for MAC-ADDRESS1, I need the entry on 2022-03-13 12:42:16 and 2022-03-13 12:47:16. Every entry between these two dates for this Mac must be deleted. Then I start at 2022-03-13 12:47:16 and go to 2022-03-13 12:52:16 and delete all entries between these two. And I need to do this for every chip (ca. 1842).
I've built a script with PHP, but this is too slow. Now, I want to build a query to execute it directly in PostgreSQL.
Is this possible and if yes, how does the query needs to be?
You can create a procedure that you execute every 5 minutes
DELETE FROM <TABLE>
WHERE createdAt >= @yourDate AND createAt <= @yourDate + interval '5 minute'
Than use procedureName('2022-03-13 12:42:16')
For job scheduling: https://www.pgadmin.org/docs/pgadmin4/development/pgagent.html