I need to get the sum of values from column x either side of every 1 in column y. The windows are 1:4 and 5:8 before and after each occurrence of 1 in y. However, if the windows are beyond the limits of the df, 0 is returned. This is an issue as it is possible that 0 could be a valid result. The full data have multiple groups (id) but hopefully this repex is enough help create a scalable solution.
I have never used zoo
before so was unable to work out how to incorporate na.rm = TRUE
and return NA where the window is completely outside the extent of the df. Solutions do not have to use zoo
, but I would prefer verb-based answers if possible.
Data and packages:
library(dplyr)
library(zoo)
set.seed(1)
df <- data.frame(id = rep("A", 40),
x = sample(0:3, 40, replace = TRUE),
y = 0)
df[c(2, 8, 9, 30, 33, 39), "y"] <- 1
What I tried:
w <- 4
df %>%
mutate(bf2 = ifelse(y == 1, rollapply(lag(x, 5), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
bf1 = ifelse(y == 1, rollapply(lag(x, 1), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
af1 = ifelse(y == 1, rollapply(lead(x, 1), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA),
af2 = ifelse(y == 1, rollapply(lead(x, 5), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA)) %>%
filter(y == 1)
id x y bf2 bf1 af1 af2
1 A 3 1 0 0 3 6
2 A 2 1 5 3 6 1
3 A 1 1 5 5 5 2
4 A 1 1 2 1 5 7
5 A 0 1 1 3 8 4
6 A 1 1 5 7 1 0
Desired output:
id x y bf2 bf1 af1 af2
1 A 3 1 NA 0 3 6
2 A 2 1 5 3 6 1
3 A 1 1 5 5 5 2
4 A 1 1 2 1 5 7
5 A 0 1 1 3 8 4
6 A 1 1 5 7 1 NA
You could always add checks to your conditions:
df %>%
mutate(bf2 = ifelse(y & row_number() > 5, rollapply(lag(x, 5), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
bf1 = ifelse(y & row_number() > 1, rollapply(lag(x, 1), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
af1 = ifelse(y & row_number() + 1 <= n(), rollapply(lead(x, 1), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA),
af2 = ifelse(y & row_number() + 5 <= n(), rollapply(lead(x, 5), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA)) |>
filter(y == 1)
Output:
id x y bf2 bf1 af1 af2
1 A 3 1 NA 0 3 6
2 A 2 1 5 3 6 1
3 A 1 1 5 5 5 2
4 A 1 1 2 1 5 7
5 A 0 1 1 3 8 4
6 A 1 1 5 7 1 NA