
Time differences for multiple entries for same ID in R


For each user ID, I would like to calculate the time difference between each entry and the next entry for that same ID, using a fixed end time as the "next entry" for the ID's last row.

This is the code I have so far:

    user <- user %>%
      group_by(user$userdata.user_id) %>%
      arrange(user$hours) %>%
      mutate(time.interval = user$hours - lag(user$hours, default = first(user$hours))) %>%
      mutate(time.interval = round(time.interval/86400, digits = 2))

I was trying to use the difftime() function. However, because the last difference for each ID has to be calculated against a preset end date ('02-20-2020' = 7), I am unable to get the following result:

    id    hours      time.decimal   time.interval
    123   03:32:12   1.200          3.3 (4.5 - 1.2)
    123   12:37:56   4.500          2.5 (7 - 4.5)
    140   09:46:33   6.300          0.7 (7 - 6.3)

Note: the above is an example of what I want to achieve. The 7 used in the time.interval column is the decimal-time version of the given end date.

Any help would be greatly appreciated.


Solution

  • You can use lead() to get the next value of time.decimal within each id, with default = 7 supplying the end time for the last row of each group.

    library(dplyr)
    
    user %>%
      group_by(id) %>%
      mutate(time.interval = lead(time.decimal, default = 7) - time.decimal) %>%
      ungroup() -> user
    
    user
    
    #     id hours    time.decimal time.interval
    #  <int> <chr>           <dbl>         <dbl>
    #1   123 03:32:12          1.2           3.3
    #2   123 12:37:56          4.5           2.5
    #3   140 09:46:33          6.3           0.7
    

    Or in data.table:

    library(data.table)
    
    setDT(user)[, time.interval := shift(time.decimal, type = 'lead', fill = 7) - time.decimal, id]
    

    Data:

    user <- structure(list(id = c(123L, 123L, 140L), hours = c("03:32:12", 
    "12:37:56", "09:46:33"), time.decimal = c(1.2, 4.5, 6.3)), 
    class = "data.frame", row.names = c(NA, -3L))
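
For comparison, the same "next value, or the end time for the last row" logic can be sketched in base R with ave(), without any extra packages. This is only an illustrative alternative to the answers above; the variable name end_time is an assumption introduced here, and the sketch assumes the rows are already sorted by time within each id.

```r
# Same sample data as in the answer above.
user <- data.frame(
  id = c(123L, 123L, 140L),
  hours = c("03:32:12", "12:37:56", "09:46:33"),
  time.decimal = c(1.2, 4.5, 6.3)
)

# Decimal version of the end date, taken from the question (assumed name).
end_time <- 7

# Within each id, subtract every value from the one that follows it.
# c(x[-1], end_time) shifts the group up by one position and puts
# end_time in the last slot, which is what lead(..., default = 7) does.
user$time.interval <- ave(
  user$time.decimal, user$id,
  FUN = function(x) c(x[-1], end_time) - x
)

print(user)
```

If the rows are not sorted, ordering the data first, for example with user <- user[order(user$id, user$time.decimal), ], keeps "next" meaning the next entry in time.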