I have a table my_table
like this:
id | updated_at | status
-----------------------------------------+-------------------------------+--------
ID1 | 2023-12-18 18:37:21.724825+00 | I
ID1 | 2024-01-02 23:08:03.50741+00 | I
ID1 | 2024-01-03 23:07:03.892617+00 | I
ID1 | 2024-01-06 23:09:00.877018+00 | A
ID1 | 2024-01-07 23:08:03.134111+00 | A
ID1 | 2024-01-08 23:06:31.164932+00 | A
ID1 | 2024-03-03 23:39:39.138697+00 | A
ID1 | 2024-03-04 23:55:54.582789+00 | F
ID1 | 2024-03-05 23:37:15.70314+00 | F
ID1 | 2024-03-06 23:47:16.814729+00 | F
ID1 | 2024-03-07 23:42:54.349137+00 | A
ID1 | 2024-03-08 23:35:53.188792+00 | A
ID1 | 2024-03-09 23:30:55.979575+00 | A
ID1 | 2024-04-10 08:30:05.329453+00 | F
ID2 | 2024-03-15 23:52:28.005298+00 | A
ID2 | 2024-03-16 23:53:38.031816+00 | A
ID2 | 2024-03-17 23:57:42.556396+00 | A
ID2 | 2024-03-18 13:33:38.264444+00 | A
ID2 | 2024-03-18 22:48:29.366489+00 | A
ID2 | 2024-03-19 15:00:02.19655+00 | F
ID2 | 2024-03-26 18:37:13.514644+00 | A
ID2 | 2024-03-27 15:19:13.823159+00 | F
ID2 | 2024-03-28 15:31:46.841049+00 | F
ID2 | 2024-04-03 07:43:08.253604+00 | P
I would like to select from it the dates at which the status changes from any value to the value F
, i.e I am expecting the following output:
ID1 | 2024-03-04 23:55:54.582789+00
ID1 | 2024-04-10 08:30:05.329453+00
ID2 | 2024-03-19 15:00:02.19655+00
ID2 | 2024-03-27 15:19:13.823159+00
So far, I managed to get the first time such a change occurs for each id
, but I can't get subsequent changes. Here is my query:
SELECT
id, updated_at
FROM (
SELECT
*
, RANK() OVER (PARTITION BY id ORDER BY updated_at ASC) AS rank_
FROM ( SELECT * FROM my_table WHERE status='F' ) t_sub
) T
WHERE rank_ = 1
How to change it to get not only the first time each id
turns to the status F
, but also second, third, fourth, etc... changes to F
?
Set up a subquery where you establish whether a row has status='F'
and the value of lag(status)over(partition by id order by updated_at)
window function is different from it, then only select those: demo
select id,updated_at
from(select *
,status<>lag(status,1,status)over w1 and status='F' as changed_to_f
from my_table
window w1 as (partition by id order by updated_at) ) as subquery
where changed_to_f;
id | updated_at |
---|---|
ID1 | 2024-03-04 23:55:54.582789+00 |
ID1 | 2024-04-10 09:30:05.329453+01 |
ID2 | 2024-03-19 15:00:02.19655+00 |
ID2 | 2024-03-27 15:19:13.823159+00 |
It's a strict interpretation: this does not catch the rows that are the initial row of a given ID
, because there's no prior row it would be changing from. If you do want to also catch those:
select id,updated_at
from(select *,status is distinct from lag(status)over w1 and status='F'
as changed_to_f
,status<>lag(status,1,status||'.')over w1 and status='F'
as also_changed_to_f
from my_table
window w1 as (partition by id order by updated_at) ) as subquery
where changed_to_f
order by 1,2;