rdatabaserowreorganize

Re-organize database by merging rows from a variable


I have a database that looks like this:

userId          SessionId        Screen         Platform       Version
01              1                first          IOS            1.0.1
01              1                main           IOS            1.0.1
01              2                first          IOS            1.0.1
01              3                first          IOS            1.0.1
01              3                main           IOS            1.0.1
01              3                detail         IOS            1.0.1
02              1                first          Android        1.0.2

Basically what I intend to do is to determine if a "path" (different screens) leads to a better retention or not. I would like to re-organize each sessionId in one column. Ideal database would look like this:

userId       SessionId       Path                 Retention
01           1               first;main           3
01           2               first                3
01           3               first;main;detail    3
02           1               first                1

Here is the variable Retention would be equal to the maximum SessionId.


Solution

  • A possible solution in base R:

    d2 <- aggregate(Screen ~ userId + SessionId, d, toString)
    transform(d2, retention = ave(Screen, userId, FUN = length))
    

    which gives:

    > d2
      userId SessionId              Screen retention
    1     01         1         first, main         3
    2     02         1               first         1
    3     01         2               first         3
    4     01         3 first, main, detail         3
    

    An alternative using dplyr:

    library(dplyr)
    d %>% 
      group_by(userId, SessionId) %>% 
      summarise(Screen = toString(Screen)) %>% 
      group_by(userId) %>% 
      mutate(retention = n())
    

    which gives:

      userId SessionId              Screen retention
       <chr>     <int>               <chr>     <int>
    1     01         1         first, main         3
    2     01         2               first         3
    3     01         3 first, main, detail         3
    4     02         1               first         1