sqlalgorithmapache-spark-sqlsnowplow

Calculate total user engagement on mobile


I want to calculate the total user engagement (session duration) done by a user in a particular mobile session. We also have events from mobile to track if a user background or foreground the app. The background and foreground events are tracked when the user moves the app to background (the app goes off screen) and foreground (the app becomes visible on the screen again).

Session Table:

Description (not part of table) event_name collector_tstamp session_id
Background the app application_background 2023-12-05T06:18:42.202+0000 A
Browsing screen_view 2023-12-05T06:18:32.097+0000 A
Foreground the app application_foreground 2023-12-05T06:18:31.955+0000 A
Background the app application_background 2023-12-05T05:40:28.096+0000 A
Browsing screen_view 2023-12-05T05:40:25.097+0000 A
Browsing screen_view 2023-12-05T05:40:23.097+0000 A
Open the app open_the_app 2023-12-05T05:40:22.000+0000 A

Expected output:

session_id Duration
A 17 seconds

Explanation: The time between first time background the app from the open the app is 6 seconds (2023-12-05T05:40:28.096+0000 minus 2023-12-05T05:40:22.000+0000).
The time between subsequent background the app from the last foreground the app is 11 seconds (2023-12-05T06:18:42.202+0000 minus 2023-12-05T06:18:31.955+0000). so 6 + 11 = 17 seconds.
And so on.

I want to write sql query to calculate session duration. Any sql language will work, I just want to understand the logic.


Solution

  • Based on the answer provided by Rob and Nick, I refer and combine them and created the below:

    WITH time_idle AS (
        SELECT
        sessionid,
        SUM(datediff(
            second,
            prev_timestamp,
            collector_tstamp)
        ) as time_idle_duration
        FROM
        (
          SELECT
              sessionid,
              event_name,
              collector_tstamp,
              LAG(event_name) OVER (ORDER BY collector_tstamp) AS prev_event,
              LAG(collector_tstamp) OVER (ORDER BY collector_tstamp) AS prev_timestamp
          FROM EVENTS
          WHERE event_name IN ('application_foreground', 'application_background')
        )
        WHERE event_name = 'application_foreground' and prev_event = 'application_background'
        group by sessionid
    )
    select
        DISTINCT
        sce.sessionid,
        datediff(
            second,
            MIN(sce.collector_tstamp) OVER (PARTITION BY sce.sessionid),
            MAX(sce.collector_tstamp) OVER (PARTITION BY sce.sessionid)
            )
        - ti.time_idle_duration as total_duration,
        datediff(
            second,
            MIN(sce.collector_tstamp) OVER (PARTITION BY sce.sessionid),
            MAX(sce.collector_tstamp) OVER (PARTITION BY sce.sessionid)
            ) AS absolute_time_duration,
          ti.time_idle_duration
    from EVENTS sce
    JOIN time_idle ti on sce.sessionid = ti.sessionid
    

    I am just calculating total down time by a user on a session, and subtract with total duration of a session.