rdplyrzoo

Return NA for conditional lead/lagged rolling sum if window outside dataframe or group


I need to get the sum of values from column x either side of every 1 in column y. The windows are 1:4 and 5:8 before and after each occurrence of 1 in y. However, if the windows are beyond the limits of the df, 0 is returned. This is an issue as it is possible that 0 could be a valid result. The full data have multiple groups (id) but hopefully this repex is enough help create a scalable solution.

I have never used zoo before so was unable to work out how to incorporate na.rm = TRUE and return NA where the window is completely outside the extent of the df. Solutions do not have to use zoo, but I would prefer verb-based answers if possible.

Data and packages:

library(dplyr)
library(zoo)

set.seed(1)
df <- data.frame(id = rep("A", 40),
                 x = sample(0:3, 40, replace = TRUE),
                 y = 0)

df[c(2, 8, 9, 30, 33, 39), "y"] <- 1

What I tried:

w <- 4

df %>%
  mutate(bf2 = ifelse(y == 1, rollapply(lag(x, 5), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
         bf1 = ifelse(y == 1, rollapply(lag(x, 1), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
         af1 = ifelse(y == 1, rollapply(lead(x, 1), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA),
         af2 = ifelse(y == 1, rollapply(lead(x, 5), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA)) %>%
  filter(y == 1)

  id x y bf2 bf1 af1 af2
1  A 3 1   0   0   3   6
2  A 2 1   5   3   6   1
3  A 1 1   5   5   5   2
4  A 1 1   2   1   5   7
5  A 0 1   1   3   8   4
6  A 1 1   5   7   1   0

Desired output:

  id x y bf2 bf1 af1 af2
1  A 3 1  NA   0   3   6
2  A 2 1   5   3   6   1
3  A 1 1   5   5   5   2
4  A 1 1   2   1   5   7
5  A 0 1   1   3   8   4
6  A 1 1   5   7   1  NA

Solution

  • You could always add checks to your conditions:

    df %>%
      mutate(bf2 = ifelse(y & row_number() > 5, rollapply(lag(x, 5), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
             bf1 = ifelse(y & row_number() > 1, rollapply(lag(x, 1), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
             af1 = ifelse(y & row_number() + 1 <= n(), rollapply(lead(x, 1), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA),
             af2 = ifelse(y & row_number() + 5 <= n(), rollapply(lead(x, 5), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA)) |>
             filter(y == 1)
    

    Output:

      id x y bf2 bf1 af1 af2
    1  A 3 1  NA   0   3   6
    2  A 2 1   5   3   6   1
    3  A 1 1   5   5   5   2
    4  A 1 1   2   1   5   7
    5  A 0 1   1   3   8   4
    6  A 1 1   5   7   1  NA