I've a time series with two columns, one column contains a "signal" either NA or not (then it's an integer between 1 and 5, I don't care about the actual value unless it differs from NA) and the second column contains the actual value.
I need to calculate the highest value since the signal was not NA. This is shown in the example below.
Date Sig Val Expected result
2008-01-01 1 47 47 <<- Sig==1, i. e. here we start counting
2008-01-02 NA 31 47
2008-01-03 NA 61 61 <<- 61 is higher than 47, so this one is important now
2008-01-04 NA 43 61
2008-01-05 NA 23 61
2008-01-06 NA 46 61
2008-01-07 NA 17 61
2008-01-08 NA 52 61
2008-01-09 NA 84 84 <<- a new high, value should be kept
2008-01-10 NA 54 84
2008-01-11 1 30 30 <<- a new signal, here we start counting again
2008-01-12 NA 36 36 <<- a new higher value in this segment
2008-01-13 NA 59 59 <<- again a new higher value in this segment
2008-01-14 NA 56 59
2008-01-15 NA 15 59
2008-01-16 NA 21 59
2008-01-17 NA 87 87
2008-01-18 NA 81 87
2008-01-19 2 94 94 <<- a new signal this time a 2, so here we start counting again
2008-01-20 NA 42 94
2008-01-21 NA 95 95
2008-01-22 1 42 42 <<- a new signal, here we start counting again
2008-01-23 NA 25 42
2008-01-24 NA 20 42
2008-01-25 NA 76 76
2008-01-26 NA 95 95
2008-01-27 NA 14 95
2008-01-28 NA 12 95
2008-01-29 NA 13 95
2008-01-30 NA 57 95
2008-01-31 NA 26 95
The problem is that the relevant observation window is dynamic, depending in signal-column.
I've played around with the runMax(), cummax() and rollapply() functions but they only work on a specific window length ... I guess I can't see the wood for the trees, but I can't figure out how to make the "look-back" window dynamic. Any guess?
Create a column that creates the signal groups, then use cummax
Using data.table
(assuming your data is in a data.frame
df)
library(data.table)
D <- as.data.table(df)
D[, list(maxvalue = cummax(value)),
by = list(sig2 = cumsum(replace(sig, is.na(sig), 0)))]