sqloracle-databaseanalytics

Rewrite query to use Analytic Functions


I have a Table Events which logs Insert, Update and Delete of Events. See the MWE her: http://sqlfiddle.com/#!4/6c2cb1/1

DDL Statement

CREATE TABLE "EVENTS" 
   (
    "EVENT_ID" VARCHAR2(30 CHAR), --Name of the Event
    "EVENT_LOCATION" VARCHAR2(60 CHAR), --Location on which the event occured
    "EVENT_TRIGGER" VARCHAR2(2 CHAR),  --Trigger which protocolled the event (I,U or D)
    "EVENT_CHANGE_ID" NUMBER,  --Unique Sequence Number
    "EVENT_CHANGE_DATE" DATE DEFAULT SYSTIMESTAMP
   );

INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT1','LOC1','I',1,SYSTIMESTAMP-1);
INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT1','LOC2','U',11,SYSTIMESTAMP-1);
INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT1','LOC4','U',117,SYSTIMESTAMP-1);
INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT1','LOC7','D',1430,SYSTIMESTAMP-1);

INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT2','LOC1','I',2,SYSTIMESTAMP-1/48);
INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT2','LOC2','U',131,SYSTIMESTAMP-1/48);
INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT2','LOC5','D',11337,SYSTIMESTAMP-1/48);
INSERT INTO EVENTS (EVENT_ID,EVENT_LOCATION,EVENT_TRIGGER,EVENT_CHANGE_ID,EVENT_CHANGE_DATE) 
VALUES ('EVENT2','LOC7','D',14430,SYSTIMESTAMP-1/48);

I want to determine the amount of events which have been Inserted at LOC1 and Deleted at LOC7 without any Deletions in between.

SELECT COUNT(*) AS QTY, TRUNC(A.EVENT_CHANGE_DATE) AS DAY
FROM (
    SELECT EVENT_ID, EVENT_CHANGE_ID, EVENT_CHANGE_DATE FROM EVENTS WHERE EVENT_TRIGGER = 'I' AND EVENT_LOCATION = 'LOC1'
    ) A,
    (SELECT EVENT_ID, EVENT_CHANGE_ID, EVENT_CHANGE_DATE FROM EVENTS WHERE EVENT_TRIGGER = 'D' AND EVENT_LOCATION = 'LOC7')
    B
WHERE B.EVENT_CHANGE_ID > A.EVENT_CHANGE_ID AND A.EVENT_ID = B.EVENT_ID
    AND not exists (SELECT EVENT_ID, EVENT_CHANGE_ID, EVENT_CHANGE_DATE FROM EVENTS WHERE EVENT_TRIGGER = 'D' AND EVENT_CHANGE_ID > A.EVENT_CHANGE_ID AND EVENT_CHANGE_ID < B.EVENT_CHANGE_ID and EVENT_ID = A.EVENT_ID) 
group by TRUNC(A.EVENT_CHANGE_DATE)
ORDER BY TRUNC(A.EVENT_CHANGE_DATE);

My naive approach works, however I am wondering if this query could be rewritten using analytic functions. The original Tables contains up to 1 Million records an 3x Full Table Scan is a nogo regarding execution time and performance.

Is it even possible to make this query more efficient with analytical functions?


Solution

  • This looks like a good fit for SQL pattern matching:

    select * from events
    match_recognize (
      partition by event_id
      order by event_change_date
      measures 
        count ( ins.* ) ins_count,
        min ( event_change_date ) dt
      pattern ( ins upd* del )
      define 
        ins as event_trigger = 'I' and event_location = 'LOC1',
        upd as event_trigger = 'U',
        del as event_trigger = 'D' and event_location = 'LOC7'
    );
    
    INS_COUNT    DT                     
               1 16-MAR-2020 12:33:58 
    

    This searches for an I(nserts) at LOC1 followed by a D(elete) at LOC7, with any number of U(pdates) in-between.