r lag locf

How can I fill the lag with leading non-zero values?


I can lag a vector like this, keeping/duplicating the initial value '1':

x <- c(1,0,0,0,0)
zoo::na.locf(dplyr::lag(x, 1), fromLast = TRUE)
[1] 1 1 0 0 0

But if there is a leading '0' in the x vector, the following '1' is not repeated:

x <- c(0,1,0,0,0)
zoo::na.locf(dplyr::lag(x, 1), fromLast = TRUE)
[1] 0 0 1 0 0

How can I repeat/keep the '1' if it is not leading? The result should look like this:

0 1 1 0 0

In other words: I want to fill the gap between the original position of the first non-zero value and its lagged position with that first non-zero value. Here is a slightly more complex example:

c(0,0,0,1,2,3,0,0,0,0)

should become

c(0,0,0,1,1,2,3,0,0,0) 

by lag 1

and

c(0,0,0,1,1,1,1,2,3,0) 

by lag 3


Solution

  • Split off the leading 0s, lag() the remaining vector using its first value as the default fill, then prepend the leading 0s again. Wrapped in a helper function:

    library(dplyr)
    
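    my_lag <- function(x, n = 1L) {
      # Nothing to fill if there is no non-zero value
      if (all(x == 0)) return(x)
      # TRUE for the run of leading zeros, FALSE from the first non-zero value on
      leading <- cumall(x == 0)
      x_0 <- x[leading]
      x_1 <- x[!leading]
      # Lag the remainder, padding with its first (non-zero) value instead of NA
      c(x_0, lag(x_1, n = n, default = x_1[[1]]))
    }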
    

    Results with your example vectors:

    x1 <- c(1,0,0,0,0)
    x2 <- c(0,1,0,0,0)
    x3 <- c(0,0,0,1,2,3,0,0,0,0)
    
    my_lag(x1)
    # 1 1 0 0 0 
    
    my_lag(x2)
    # 0 1 1 0 0
    
    my_lag(x3)
    # 0 0 0 1 1 2 3 0 0 0
    
    my_lag(x3, 3)
    # 0 0 0 1 1 1 1 2 3 0
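
If you would rather not depend on dplyr, the same idea (leave the leading 0s alone, shift the rest, and pad with the first non-zero value) can be sketched in base R. This is only a rough alternative, not a drop-in replacement: the name my_lag_base is made up for illustration, and it assumes x is a numeric vector without NAs and n is a non-negative whole number.

```r
# Hypothetical base-R variant of the helper above (no dplyr needed).
# Assumes: x is numeric with no NA values, n is a non-negative integer.
my_lag_base <- function(x, n = 1L) {
  # Position of the first non-zero value (NA if there is none)
  first_nz <- which(x != 0)[1]
  if (is.na(first_nz)) return(x)  # all zeros (or empty): nothing to fill

  # Everything from the first non-zero value to the end
  tail_part <- x[first_nz:length(x)]

  # Prepend n copies of the first non-zero value, then trim back
  # to the original length of tail_part
  lagged <- head(c(rep(tail_part[1], n), tail_part), length(tail_part))

  # Re-attach the untouched leading zeros
  c(x[seq_len(first_nz - 1)], lagged)
}

my_lag_base(c(0, 1, 0, 0, 0))
# 0 1 1 0 0

my_lag_base(c(0, 0, 0, 1, 2, 3, 0, 0, 0, 0), 3)
# 0 0 0 1 1 1 1 2 3 0
```

Padding with rep() and trimming with head() plays the role of lag(..., default = ...), so no NAs are ever introduced and no na.locf() step is needed.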