azureazure-data-explorerkqlkusto-explorerplayfab

How recreate sessions in Kusto without session ID but with event associated to login


I got a problem about recreating sessions in Kusto. I got an event named client_session_start, which timetamp gives me info about when a player starts a new session. The problem is that I don't have an event for the session end, so I should label everything between those 2 events per each player with a new session ID. This is how it looks like

ID_player   Timestamp   event_name
aaa         12:00:00    client_session_start
aaa         12:30:00    client_session_start

In this case, I need to label everything between 12:00 and 12:30 and also the last session (past 12:30) which doesn't have the next ''client_session_start''.

I thought I could extract all the events named ''client_session_start'', rank them by user and timestamp, then merge the new session_id and use something like the SQL function last_value() to fill the gaps (something like this SQL QUERY replace NULL value in a row with a value from the previous known value)

but it seems I cannot do that in that way in Kusto. Any suggestion?


Solution

  • partition & row_cumsum()

    datatable (ID_player:string, Timestamp:timespan, event_name:string)
    [
         "aaa", "12:00:00", "client_session_start"
        ,"aaa", "12:10:00", ""
        ,"aaa", "12:20:00", ""
        ,"aaa", "12:30:00", "client_session_start"
        ,"bbb", "12:00:00", "client_session_start"
        ,"bbb", "12:10:00", ""
        ,"bbb", "12:20:00", ""
        ,"bbb", "12:30:00", "client_session_start" 
        ,"bbb", "12:40:00", ""    
        ,"bbb", "12:50:00", "client_session_start"  
        ,"bbb", "13:00:00", ""        
        ,"ccc", "12:00:00", "client_session_start" 
        ,"ccc", "12:10:00", ""
        ,"ccc", "12:20:00", ""
        ,"ccc", "12:30:00", ""
        ,"ccc", "12:40:00", ""             
    ]
    |   partition hint.strategy=native by ID_player
        (
                order by Timestamp asc
            |   extend session_index = row_cumsum(case(event_name == "client_session_start", 1, 0))
        )
    
    ID_player Timestamp event_name session_index
    aaa 12:00:00 client_session_start 1
    aaa 12:10:00 1
    aaa 12:20:00 1
    aaa 12:30:00 client_session_start 2
    bbb 12:00:00 client_session_start 1
    bbb 12:10:00 1
    bbb 12:20:00 1
    bbb 12:30:00 client_session_start 2
    bbb 12:40:00 2
    bbb 12:50:00 client_session_start 3
    bbb 13:00:00 3
    ccc 12:00:00 client_session_start 1
    ccc 12:10:00 1
    ccc 12:20:00 1
    ccc 12:30:00 1
    ccc 12:40:00 1

    Fiddle