sqlteradatateradata-sql-assistant

Lag or first_value doesn't work when year is changed


    with
    --my input--
    x(id1,med_id,dt,dt_tm,casemgr_id,casemgr_clntid,status) as (
    select 123456,98410,date'2024-04-19',timestamp'2024-04-19 09:00:00',12345,67891,-2
    union all
    select 194567,98410,date'2024-04-19',timestamp'2024-04-19 11:00:00',12345,67891,-2
    union all
    select 789101,98410,date'2024-04-24',timestamp'2024-04-24 09:00:00',12345,67891,-2
    union all
    select 194587,98410,date'2024-04-25',timestamp'2024-04-25 09:00:00',12345,67891,-2
    union all
    select 234561,98410,date'2024-04-26',timestamp'2024-04-26 09:00:00',12345,67891,-2
    union all
    select 456789,91956,date'2024-04-19',timestamp'2024-04-19 08:00:00',99012,87567,-2
    union all
    select 998415,91956,date'2024-12-20',timestamp'2024-12-20 07:00:00',99012,87567,-2
union all
  select 01786,18745,date'2023-11-16',timestamp'2023-11-16 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-11-21',timestamp'2023-11-21 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-12-01',timestamp'2023-12-01 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2023-12-04',timestamp'2023-12-04 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-02-02',timestamp'2024-02-02 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-05',timestamp'2024-07-05 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-21',timestamp'2024-07-21 07:00:00',12434,87438,-2
union all
  select 01786,18745,date'2024-07-21',timestamp'2024-07-23 07:00:00',12434,87438,-2
    );

Input above. What I am trying to achieve is that for a combination of med_id, casemgr_clntid and dt, the first record sorted by date should be valid, and the subsequent records should be invalid if it falls within 180 days. Then the code should look for the first record after the 180 day timeframe, that record should be valid, and this record should be the starting point and the subsequent rows should be invalid. If greater than 180 days, it should be valid. if there are multiple records on the same day, like in the first two instances, the earlier record should be valid since it's earlier timestamp and the other should be invalid since it's later in the day, In other words, the first record at 9am should be valid while the second record should be invalid. I have also put the desired output at the end. So for example in the last three rows, where the record on 5th july is valid since it's the first record after the 180 day time window, but subsequently it should be invalid (in other words, it has restarted the time window). So the records on 21st July and 23rd july should take the record on 5th july as a starting point

-- sql qry which was suggested, tweaked to explictly put the date format...
SELECT
  id1
, med_id
, dt
,dt_tm
, casemgr_id
, casemgr_clntid
, CASE WHEN (
           to_char(dt,'yyyy-mm-dd)' -
         , lag(to_char(dt,'yyyy-mm-dd')) OVER(PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm)
         )  is null
      OR (
           to_char(dt,'yyyy-mm-dd)' -
         , lag(to_char(dt,'yyyy-mm-dd')) OVER(PARTITION BY med_id, casemgr_clntid ORDER BY dt_tm)

         )  >180
    THEN 'valid'
    ELSE 'invalid'
  END AS status
FROM x
ORDER BY med_id, casemgr_clntid, dt_tm 
;

I also tried subtracting by dt_tm, but the dt_tm-first_value(dt_tm) doesn't work.

Error: invalid operation

    with
        --my desired output--
        x(id1,med_id,dt,dt_tm,casemgr_id,casemgr_clntid,status) as (
        select 123456,98410,date'2024-04-19',timestamp'2024-04-19 09:00:00',12345,67891, 'valid'
        union all
        select 194567,98410,date'2024-04-19',timestamp'2024-04-19 11:00:00',12345,67891,'invalid'
        union all
        select 789101,98410,date'2024-04-24',timestamp'2024-04-24 09:00:00',12345,67891,'invalid'
        union all
        select 194587,98410,date'2024-04-25',timestamp'2024-04-25 09:00:00',12345,67891,'invalid'
        union all
        select 234561,98410,date'2024-04-26',timestamp'2024-04-26 09:00:00',12345,67891,'invalid'
        union all
        select 456789,91956,date'2024-04-19',timestamp'2024-04-19 08:00:00',99012,87567,'valid'
        union all
        select 998415,91956,date'2024-12-20',timestamp'2024-12-20 07:00:00',99012,87567,'valid'
union all
  select 01786,18745,date'2023-11-16',timestamp'2023-11-16 07:00:00',12434,87438,valid
union all
  select 01786,18745,date'2023-11-21',timestamp'2023-11-21 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2023-12-01',timestamp'2023-12-01 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2023-12-04',timestamp'2023-12-04 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2024-02-02',timestamp'2024-02-02 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2024-07-05',timestamp'2024-07-05 07:00:00',12434,87438,valid
union all
  select 01786,18745,date'2024-07-21',timestamp'2024-07-21 07:00:00',12434,87438,invalid
union all
  select 01786,18745,date'2024-07-23',timestamp'2024-07-23 07:00:00',12434,87438,invalid

        );

Solution

  • If I am not mistaken, the dt column is redundant, as the date is included in dt_tm. And the whole thing is not about the first dt per med_id and casemgr_clntid anyway, but about their first (i.e. minimum) dt_tm.

    SELECT
      id1,
      med_id,
      dt,
      dt_tm,
      casemgr_id,
      casemgr_clntid,
      CASE WHEN dt_tm = MIN(dt_tm) OVER(PARTITION BY med_id, casemgr_clntid)
             OR dt_tm >= CAST(MIN(dt_tm) OVER(PARTITION BY med_id, casemgr_clntid) AS DATE)
                         + INTERVAL '181' DAY
        THEN 'valid'
        ELSE 'invalid'
      END AS status
    FROM x
    ORDER BY med_id, casemgr_clntid, dt_tm;