r apply rle

Add the results of RLE to the original data frame in R


I have a data frame containing dates and, for each date, the number of events that took place. From this I add a field indicating whether the number of events was above average.

Date   Events  Above Average
01/01  7       0
02/01  8       1
03/01  8       1
04/01  6       0
05/01  8       1
06/01  9       1
07/01  4       0
08/01  7       0

If I then perform a run-length encoding (RLE) on the Above Average field, I get:

Count  Value
1      FALSE
2      TRUE
1      FALSE
2      TRUE
2      FALSE

How can I use this information to add an additional field, as below, to my original data frame?

Date   Events  Above Average  Run Above Av
01/01  7       0              0
02/01  8       1              2
03/01  8       1              2
04/01  6       0              0
05/01  8       1              2
06/01  9       1              2
07/01  4       0              0
08/01  7       0              0

Solution

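  • You seem to be looking for the rle lengths, each repeated by itself (so that every row carries the length of the run it belongs to), then multiplied by the sign of the Above Average column, which zeroes out the rows that are not above average.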

    library(dplyr)
    
    df %>%
      mutate(`Run Above Av` = rep(rle(`Above Average`)$lengths,
                times = rle(`Above Average`)$lengths) * sign(`Above Average`))
    #>    Date Events Above Average Run Above Av
    #> 1 01/01      7             0            0
    #> 2 02/01      8             1            2
    #> 3 03/01      8             1            2
    #> 4 04/01      6             0            0
    #> 5 05/01      8             1            2
    #> 6 06/01      9             1            2
    #> 7 07/01      4             0            0
    #> 8 08/01      7             0            0
    

    Data from question in reproducible format

    df <- structure(list(Date = c("01/01", "02/01", "03/01", "04/01", "05/01", 
    "06/01", "07/01", "08/01"), Events = c(7L, 8L, 8L, 6L, 8L, 9L, 
    4L, 7L), `Above Average` = c(0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L)), 
    class = "data.frame", row.names = c(NA, -8L))
    

    Created on 2022-06-22 by the reprex package (v2.0.1)
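
To make the intermediate values visible, the same idea can be sketched step by step in base R. This is only an illustration of the approach described above, not a replacement for it: the helper names runs and run_len are hypothetical, and the data frame is rebuilt from the values in the question. Because Above Average holds only 0 and 1, multiplying by the column directly has the same effect as multiplying by its sign.

```r
# Rebuild the question's data (check.names = FALSE keeps the space in the name)
df <- data.frame(
  Date = c("01/01", "02/01", "03/01", "04/01",
           "05/01", "06/01", "07/01", "08/01"),
  Events = c(7L, 8L, 8L, 6L, 8L, 9L, 4L, 7L),
  `Above Average` = c(0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L),
  check.names = FALSE
)

# Step 1: run-length encode the indicator column
runs <- rle(df$`Above Average`)
# runs$lengths is 1 2 1 2 2, runs$values is 0 1 0 1 0

# Step 2: repeat each run length once per row of that run,
# so the vector is as long as the data frame again
run_len <- rep(runs$lengths, times = runs$lengths)
# run_len is 1 2 2 1 2 2 2 2

# Step 3: keep the run length only where the indicator is 1
df$`Run Above Av` <- run_len * df$`Above Average`

print(df)
```

Storing the rle result once also avoids computing it twice, which may matter a little on long columns.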