I have a dataframe named "log" that looks like this:
ID time user
1 2014-09-19 13:57:36 Alice
2 2014-09-19 13:58:06 Alice
3 2014-09-19 13:58:16 Alice
4 2014-09-19 13:58:36 Alice
5 2014-09-19 13:59:56 Alice
6 2014-09-19 13:59:56 Bob
7 2014-09-19 13:59:57 Bob
8 2014-09-19 13:59:59 Bob
9 2014-09-19 14:00:04 Bob
I would like to add a new column of type difftime. It should be 0 when the user appears for the first time; otherwise it should be the current time minus that user's previous time. The desired output is:
ID time user timediff
1 2014-09-19 13:57:36 Alice 0
2 2014-09-19 13:58:06 Alice 30
3 2014-09-19 13:58:16 Alice 10
4 2014-09-19 13:58:36 Alice 20
5 2014-09-19 13:59:56 Alice 20
6 2014-09-19 13:59:56 Bob 0
7 2014-09-19 13:59:57 Bob 1
8 2014-09-19 13:59:59 Bob 2
9 2014-09-19 14:00:04 Bob 5
My plan was to create a vector of difftime with size 9, but I got an error.
t <- difftime(9)
Error in as.POSIXct.numeric(time1) : 'origin' must be supplied
I'm a beginner in R. Could anyone be kind enough to show me how to do this? Thank you very much.
difftime is a function for finding time differences, not a way to create an empty vector, so you cannot initialize a date-time vector with it. Calling difftime(9) tries to convert 9 to a date-time, which is why the error asks for an 'origin'. You'll need as.POSIXct to convert the time column to date-times; then we can just use diff, since it has a method for POSIXt.
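To make the difference between these functions concrete, here is a small sketch. The variable names and the tz = "UTC" setting are my own assumptions for illustration, not something from the question; the timestamps are the first three rows for Alice.

```r
# difftime() takes two date-times and returns the difference between them.
t1 <- as.POSIXct("2014-09-19 13:57:36", tz = "UTC")  # tz is an assumption
t2 <- as.POSIXct("2014-09-19 13:58:06", tz = "UTC")
d <- difftime(t2, t1, units = "secs")

# diff() on a POSIXct vector returns the gaps between consecutive elements.
times <- as.POSIXct(c("2014-09-19 13:57:36",
                      "2014-09-19 13:58:06",
                      "2014-09-19 13:58:16"), tz = "UTC")
gaps <- diff(times)
units(gaps) <- "secs"   # pin the unit instead of relying on the automatic choice

# If a pre-sized vector of time differences is really wanted,
# as.difftime() can build one from plain numbers.
empty <- as.difftime(numeric(9), units = "secs")
```

Here d holds the 30-second gap between the first two rows, gaps holds the two consecutive gaps, and empty is a length-9 difftime vector of zeros.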
You could use dplyr for the grouped calculations.
library(dplyr)
group_by(log, user) %>% mutate(timediff = c(0, diff(as.POSIXct(time))))
# ID time user timediff
# 1 1 2014-09-19 13:57:36 Alice 0
# 2 2 2014-09-19 13:58:06 Alice 30
# 3 3 2014-09-19 13:58:16 Alice 10
# 4 4 2014-09-19 13:58:36 Alice 20
# 5 5 2014-09-19 13:59:56 Alice 80
# 6 6 2014-09-19 13:59:56 Bob 0
# 7 7 2014-09-19 13:59:57 Bob 1
# 8 8 2014-09-19 13:59:59 Bob 2
# 9 9 2014-09-19 14:00:04 Bob 5
Note: You may have miscalculated the time difference on Alice's last row. 13:58:36 and 13:59:56 fall in different minutes, so the difference is 80 seconds, not 20.
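If you would rather not load a package, the same grouped calculation can be sketched in base R with ave(). This is only one possible approach: it rebuilds the data frame from the question, assumes the rows are already sorted by time within each user, and uses tz = "UTC" (my assumption) so the result does not depend on local daylight-saving rules.

```r
# Rebuild the example data from the question.
log <- data.frame(
  ID = 1:9,
  time = c("2014-09-19 13:57:36", "2014-09-19 13:58:06",
           "2014-09-19 13:58:16", "2014-09-19 13:58:36",
           "2014-09-19 13:59:56", "2014-09-19 13:59:56",
           "2014-09-19 13:59:57", "2014-09-19 13:59:59",
           "2014-09-19 14:00:04"),
  user = rep(c("Alice", "Bob"), times = c(5, 4)),
  stringsAsFactors = FALSE
)

# Convert to seconds since the epoch, so every difference is in seconds.
secs <- as.numeric(as.POSIXct(log$time, tz = "UTC"))  # tz is an assumption

# ave() applies the function within each user and returns a vector aligned
# with the original rows; the first row of each user gets 0.
log$timediff <- ave(secs, log$user, FUN = function(x) c(0, diff(x)))
```

Working in plain seconds sidesteps the automatic unit selection that diff() does on date-times, which could otherwise pick different units for different users.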