snowflake-cloud-data-platformmatch-recognize

MATCH_RECOGNIZE to find events that do not end with certain pattern


I have the transaction audit history as following. All audits start with a INIT and may have a REVERSAL and a MODIFICATION. I need to find all the audits with INIT -> REVERSAL but no MODIFICATION

TRANSACTION_ID ACTION AMOUNT AUDIT_TS
0 INIT 14 2022-07-10 14:49:17
111 INIT 10 2022-07-10 14:03:09
111 REVERSAL 10 2022-07-10 14:24:10
111 MODIFICATION 8 2022-07-10 14:49:11
222 INIT 12 2022-07-10 14:07:12
222 REVERSAL 12 2022-07-10 14:24:12
222 MODIFICATION 9 2022-07-10 14:43:13
333 REVERSAL 15 2022-07-10 14:26:14
333 MODIFICATION 14 2022-07-10 14:43:15
444 INIT 14 2022-07-10 14:42:16
555 INIT 5 2022-07-10 14:08:18
555 REVERSAL 5 2022-07-10 14:26:19

I tried using the Regex End Anchor ($) as following, but it also returns the audits that end with MODIFICATION:

select 
 *
from audit_table
match_recognize(
    partition by transaction_id
    order by audit_ts
    ALL ROWS PER MATCH
    PATTERN (INIT REVERSAL$)
    DEFINE
          INIT as iff(ACTION='INIT',TRUE,FALSE),
          REVERSAL as iff(ACTION='REVERSAL',TRUE,FALSE),
          MODIFICATION as iff(ACTION='MODIFICATION',TRUE,FALSE)
);

Please advise.


Solution

  • The pattern needs to be altered using quantifiers ^INIT{1} REVERSAL{1}$:

    select  *
    from audit_table
    match_recognize(
        partition by transaction_id
        order by audit_ts
        ALL ROWS PER MATCH
        PATTERN (^INIT{1} REVERSAL{1}$)
        DEFINE
             INIT as ACTION='INIT',
             REVERSAL as ACTION='REVERSAL',
             MODIFICATION as ACTION='MODIFICATION'
    )
    ORDER BY transaction_id, audit_ts;
    

    Snowflake supports boolean data type natively so:

    INIT as iff(ACTION='INIT',TRUE,FALSE),
    <=>
    INIT as ACTION='INIT',
    

    Output:

    enter image description here