[SOLVED] Dataframe manipulation in R - selecting certain values

Dataframe manipulation in R - selecting certain values

I have the following issue that I can't find an answer to:

I have a df that looks like this:

  id x1_t1 x1_t2 x1_t3 y1_t1 y1_t2 y1_t3
1  1     0     0     0   100   110   150
2  2     0     0     1   150   140   180
3  3     0     1     1    90   150   160

x1 and y1 are in their multiple times (different time points). Depending on when the x1 variable changes from 0 to 1, I want the respective y1 values, so that the resulting dataframe would look like this:

  id y1_before y1_after
1  1       110      150
2  2       140      180
3  3        90      150

I have already messed around with dplyr quite a bit, but haven't found a working solution. Perhaps you could give me an idea. thanks in advance!!

Solution

1) dplyr Using dat shown reproducibly in the Note at the end, define the x1_* columns as x1 and then y1_* columns as y1 and the index of the match to the first 1 (or the last y1 if none) as i. Then use i to extract the desired values from x1 and y1.

library(dplyr)

dat %>%
  mutate(
    x1 = pick(starts_with("x1")), 
    y1 = pick(starts_with("y1")),
    i = match(1, x1, length(x1)), .by = id) %>%
  summarize(y1_before = y1[[i-1]], y1_after = y1[[i]], .by = id)

giving

  id y1_before y1_after
1  1       110      150
2  2       140      180
3  3        90      150

2) Base R Pretty much the same logic can be used in base R:

do.call("rbind", by(dat, dat$id, \(z) with(z, {
  x1 <- z[grep("x1", names(z))]
  y1 <- z[grep("y1", names(z))]
  i <- match(1, x1, length(x1))
  data.frame(id, y1_before = y1[[i-1]], y1_after = y1[[i]])
})))

Note

dat <- data.frame(
  id = 1:3,
  x1_t1 = integer(3),
  x1_t2 = c(0L, 0L, 1L),
  x1_t3 = c(0L, 1L, 1L),
  y1_t1 = c(100L, 150L, 90L),
  y1_t2 = c(110L, 140L, 150L),
  y1_t3 = c(150L, 180L, 160L)
)