sqlvertica

Time difference between rows based on condition


Not really an expert with SQL and im having problems figuring out how to do this one.

Got a table like this one:

ID Message TimeStamp User
1 Hello 2022-08-01 10:00:00 A
1 How are you? 2022-08-01 10:00:05 A
1 Hello there 2022-08-01 10:00:10 B
1 I am okay 2022-08-01 10:00:12 B
1 Good to know 2022-08-01 10:00:15 A
1 Bye 2022-08-01 10:00:25 B
2 Hello 2022-08-01 10:02:50 A
2 Hi 2022-08-01 10:03:50 B

I need to calculate the time difference each time there is a response from the B user after a message from A.

Expected result would be like this

ID Difference
1 5
1 10
2 60

Trying to use Lead function to obtain the next desired timestamp but im not getting the expected result Any tips or advice?

Thanks


Solution

  • Even if it is already answered - it's a nice use case for Vertica's MATCH() clause. Looking for a pattern consisting of sender = 'A' followed by sender = 'B'. You get a pattern id, and then you can group by other stuff plus the pattern id to get max timestamp and min timestamp.

    Also note that I renamed both "user" and "timestamp", as they are reserved words...

    -- your input, don't use in final query
    indata(ID,Message,ts,Usr) AS (
              SELECT 1,'Hello'       ,TIMESTAMP '2022-08-01 10:00:00','A'
    UNION ALL SELECT 1,'How are you?',TIMESTAMP '2022-08-01 10:00:05','A'
    UNION ALL SELECT 1,'Hello there' ,TIMESTAMP '2022-08-01 10:00:10','B'
    UNION ALL SELECT 1,'I am okay'   ,TIMESTAMP '2022-08-01 10:00:12','B'
    UNION ALL SELECT 1,'Good to know',TIMESTAMP '2022-08-01 10:00:15','A'
    UNION ALL SELECT 1,'Bye'         ,TIMESTAMP '2022-08-01 10:00:25','B'
    UNION ALL SELECT 2,'Hello'       ,TIMESTAMP '2022-08-01 10:02:50','A'
    UNION ALL SELECT 2,'Hi'          ,TIMESTAMP '2022-08-01 10:03:50','B'
    )
    -- end of input, real query starts here , replace following comma with "WITH" 
    ,
    w_match_clause AS (
      SELECT                                                                                                                                             
        *
      , event_name()
      , pattern_id()
      , match_id()
      FROM indata
      MATCH (
        PARTITION BY id ORDER BY ts
        DEFINE
          sentbya AS usr='A'
        , sentbyb AS usr='B'
        PATTERN
        p AS (sentbya sentbyb)
      )
       -- ctl SELECT * FROM w_match_clause;
       -- ctl  ID |   Message    |         ts          | Usr | event_name | pattern_id | match_id 
       -- ctl ----+--------------+---------------------+-----+------------+------------+----------
       -- ctl   1 | How are you? | 2022-08-01 10:00:05 | A   | sentbya    |          1 |        1
       -- ctl   1 | Hello there  | 2022-08-01 10:00:10 | B   | sentbyb    |          1 |        2
       -- ctl   1 | Good to know | 2022-08-01 10:00:15 | A   | sentbya    |          2 |        1
       -- ctl   1 | Bye          | 2022-08-01 10:00:25 | B   | sentbyb    |          2 |        2
       -- ctl   2 | Hello        | 2022-08-01 10:02:50 | A   | sentbya    |          1 |        1
       -- ctl   2 | Hi           | 2022-08-01 10:03:50 | B   | sentbyb    |          1 |        2
    )
    SELECT
      id
    , MAX(ts) - MIN(ts) AS difference
    FROM w_match_clause
    GROUP BY
      id
    , pattern_id
    ORDER BY
      id;
    -- out  id | difference 
    -- out ----+------------
    -- out   1 | 00:00:05
    -- out   1 | 00:00:10
    -- out   2 | 00:01