r vector

Creating an empty vector of difftime type


I have a dataframe named "log" that looks like this:

  ID    time                        user     
  1     2014-09-19 13:57:36         Alice    
  2     2014-09-19 13:58:06         Alice    
  3     2014-09-19 13:58:16         Alice    
  4     2014-09-19 13:58:36         Alice    
  5     2014-09-19 13:59:56         Alice    
  6     2014-09-19 13:59:56         Bob      
  7     2014-09-19 13:59:57         Bob      
  8     2014-09-19 13:59:59         Bob      
  9     2014-09-19 14:00:04         Bob      

I would like to add a new column of type difftime. It should be 0 the first time a user appears; otherwise it should be the current time minus that user's previous time. The desired output is:

  ID    time                        user      timediff
  1     2014-09-19 13:57:36         Alice     0
  2     2014-09-19 13:58:06         Alice     30
  3     2014-09-19 13:58:16         Alice     10
  4     2014-09-19 13:58:36         Alice     20
  5     2014-09-19 13:59:56         Alice     20
  6     2014-09-19 13:59:56         Bob       0
  7     2014-09-19 13:59:57         Bob       1
  8     2014-09-19 13:59:59         Bob       2
  9     2014-09-19 14:00:04         Bob       5

My plan was to create a vector of difftime with size 9, but I got an error.

t <- difftime(9)

Error in as.POSIXct.numeric(time1) : 'origin' must be supplied

I'm a beginner in R. Could anyone be kind enough to show me how to do this? Thank you very much.


Solution

  • difftime() is a function for computing the difference between two date-times, not a constructor for an empty vector, so difftime(9) tries to convert 9 to a date-time and fails. You'll need as.POSIXct() to convert the time column to date-times; then you can use diff(), since it has a method for POSIXt.
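
    If a pre-allocated difftime vector is still wanted, base R's as.difftime() can build one from a numeric vector, provided the units are given explicitly. A minimal sketch; the name timediff, the length 9 and the units "secs" are assumptions taken from the question's data:

    ```r
    # Build a length-9 vector of zero time differences.
    # For numeric input, as.difftime() needs explicit units.
    timediff <- as.difftime(numeric(9), units = "secs")

    class(timediff)   # "difftime"
    length(timediff)  # 9
    units(timediff)   # "secs"
    ```

    For the grouped calculation itself, pre-allocating is not needed.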

    You could use dplyr for the grouped calculations.

    library(dplyr)
    # Within each user: 0 for the first row, then the gap to the previous row
    group_by(log, user) %>% mutate(timediff = c(0, diff(as.POSIXct(time))))
    #   ID                time  user timediff
    # 1  1 2014-09-19 13:57:36 Alice        0
    # 2  2 2014-09-19 13:58:06 Alice       30
    # 3  3 2014-09-19 13:58:16 Alice       10
    # 4  4 2014-09-19 13:58:36 Alice       20
    # 5  5 2014-09-19 13:59:56 Alice       80
    # 6  6 2014-09-19 13:59:56   Bob        0
    # 7  7 2014-09-19 13:59:57   Bob        1
    # 8  8 2014-09-19 13:59:59   Bob        2
    # 9  9 2014-09-19 14:00:04   Bob        5
    

    Note: The expected value in Alice's last row looks miscalculated. 13:58:36 and 13:59:56 fall in different minutes, so the difference is 80 seconds, not 20.
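
To check the 80-second figure without dplyr, the same grouped calculation can be sketched in base R. Because c(0, diff(...)) starts with a plain number, it returns plain numbers rather than a difftime, and diff() on date-times picks its units automatically. This version fixes the units to seconds and stores a real difftime column. The names log_df, secs and gap, and the UTC time zone, are assumptions made for illustration:

```r
# Rebuild the question's data (called log_df here to avoid masking base::log)
log_df <- data.frame(
  ID   = 1:9,
  time = c("2014-09-19 13:57:36", "2014-09-19 13:58:06", "2014-09-19 13:58:16",
           "2014-09-19 13:58:36", "2014-09-19 13:59:56", "2014-09-19 13:59:56",
           "2014-09-19 13:59:57", "2014-09-19 13:59:59", "2014-09-19 14:00:04"),
  user = rep(c("Alice", "Bob"), times = c(5, 4)),
  stringsAsFactors = FALSE
)

# Parse the timestamps to seconds since the epoch (UTC is an assumption)
secs <- as.numeric(as.POSIXct(log_df$time, tz = "UTC"))

# Per user: 0 for the first row, then the gap to the previous row in seconds
gap <- ave(secs, log_df$user, FUN = function(x) c(0, diff(x)))

# Store the result as a difftime column with explicit units
log_df$timediff <- as.difftime(gap, units = "secs")

as.numeric(log_df$timediff)  # 0 30 10 20 80 0 1 2 5
```

Working in numeric seconds and converting once at the end keeps every group in the same units, which the automatic unit choice of diff() does not guarantee when gaps vary widely between users.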