rloops

Calculate Time Elapsed Since Last Events with Multiple Event Types and IDs


Similar questions have been asked before where the question is how to calculate the number of observations since an event. I have a further request. How to calculate the number of days since the same type of observation but also to find the number of days since any other type of event. I also have ids.

To illustrate please see below. I am trying to do this in R using Datatables but to little result.

What I have:

  date     event  id
2000-07-06     2  1
2000-07-07     1  1
2000-07-09     0  1
2000-07-10     0  1
2000-07-15     2  1
2000-07-16     1  1
2000-07-20     0  1
2000-07-21     1  1
2000-07-06     1  2
2000-07-07     2  2
2000-07-15     0  2
2000-07-16     0  2
2000-07-17     2  2
2000-07-18     1  2

and what I would like to have is as follows:

  date     event  id days_since_event_1 days_since_event_2
2000-07-06     2  1        NA                 NA
2000-07-07     1  1        NA                 1
2000-07-09     0  1        2                  3
2000-07-10     0  1        3                  4
2000-07-15     2  1        8                  9
2000-07-16     1  1        9                  1
2000-07-20     0  1        4                  5
2000-07-21     1  1        5                  6
2000-07-06     1  2        NA                 NA
2000-07-07     2  2        1                  NA
2000-07-15     0  2        9                  8
2000-07-16     0  2        10                 9
2000-07-17     2  2        11                 10
2000-07-18     1  2        12                 1

The two events are mutually exclusive, that is, they cannot take place on the same day.


Solution

  • The following uses the Chron Library to calculate difference in the dates

    library(chron)
    
    df$date <- chron(as.character(df$date),format=c(date="y-m-d"))
    
    for(j in unique(df$id)){
      DaysSince1 <-NA
      DaysSince2 <-NA
      RowsWithID <- grep(j,df$id)
    
      for(i in RowsWithID){
        df$days_since_event_1[i] <- df$date[i]-df$date[i-DaysSince1]
        df$days_since_event_2[i] <- df$date[i]-df$date[i-DaysSince2]
    
        if(df$event[i]==1){DaysSince1<-1}
          else{DaysSince1<-DaysSince1+1}
    
        if(df$event[i]==2){DaysSince2<-1}
          else{DaysSince2<-DaysSince2+1}
      }
    }
    

    This code gives the following results

    > df
           date event id days_since_event_1 days_since_event_2
    1  00-07-06     2  1                 NA                 NA
    2  00-07-07     1  1                 NA                  1
    3  00-07-09     0  1                  2                  3
    4  00-07-10     0  1                  3                  4
    5  00-07-15     2  1                  8                  9
    6  00-07-16     1  1                  9                  1
    7  00-07-20     0  1                  4                  5
    8  00-07-21     1  1                  5                  6
    9  00-07-06     1  2                 NA                 NA
    10 00-07-07     2  2                  1                 NA
    11 00-07-15     0  2                  9                  8
    12 00-07-16     0  2                 10                  9
    13 00-07-17     2  2                 11                 10
    14 00-07-18     1  2                 12                  1
    

    To address you comment, you can do the following in Base R to get the number of observations rather than days. No Libraries needed.

    for(j in unique(df$id)){
      ObsSince1 <-NA
      ObsSince2 <-NA
      RowsWithID <- grep(j,df$id)
    
      for(i in RowsWithID){
        df$Obs_since_event_1[i] <- ObsSince1
        df$Obs_since_event_2[i] <- ObsSince2
    
        if(df$event[i]==1){ObsSince1<-1}
        else{ObsSince1<-ObsSince1+1}
    
        if(df$event[i]==2){ObsSince2<-1}
        else{ObsSince2<-ObsSince2+1}
      }
    }
    

    You should get the following output

    > df
             date event id Obs_since_event_1 Obs_since_event_2
    1  2000-07-06     2  1                NA                NA
    2  2000-07-07     1  1                NA                 1
    3  2000-07-09     0  1                 1                 2
    4  2000-07-10     0  1                 2                 3
    5  2000-07-15     2  1                 3                 4
    6  2000-07-16     1  1                 4                 1
    7  2000-07-20     0  1                 1                 2
    8  2000-07-21     1  1                 2                 3
    9  2000-07-06     1  2                NA                NA
    10 2000-07-07     2  2                 1                NA
    11 2000-07-15     0  2                 2                 1
    12 2000-07-16     0  2                 3                 2
    13 2000-07-17     2  2                 4                 3
    14 2000-07-18     1  2                 5                 1