r interpolation panel

How to impute a value that occurs on only one occasion in a sequence


Basically, I want to interpolate, using R, when a value is different for only one occasion (in this case a month). Let's say my data looks like this:

    df <- data.frame(id = c(1,1,1,1,1,1,2,2,2,2,2,2), 
      month = c(1,2,3,4,5,6,1,2,3,4,5,6), 
      status = c('a', 'a', 'b', 'a', 'b', 'b', 'a', 'a', 'b', 'b', 'b', 'b'))

How do I interpolate, within ids, when a status occurs only once within a sequence? In my example, I want to change the first b (row 3) into a; none of the others need changing.


Solution

  • I am not sure whether you want to apply a sliding window that replaces an "outlier" with the value of its neighbors. If so, the code below may fit your goal (it assumes the sliding window moves from left to right):

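    f <- \(x) {
      l <- length(x)
      # Fewer than three elements: there are not enough neighbors to compare
      if (l < 3) return(x)
      # First element: copy the second if the second and third agree
      if (x[2] == x[3] && x[2] != x[1]) {
        x[1] <- x[2]
      }
      # Last element: copy its predecessor if the two before it agree
      if (x[l - 1] == x[l - 2] && x[l - 1] != x[l]) {
        x[l] <- x[l - 1]
      }
      # Interior elements, left to right: replace a value that differs from
      # two identical neighbors (earlier replacements carry forward)
      for (i in 2:(l - 1)) {
        if (x[i + 1] == x[i - 1] && x[i] != x[i + 1]) {
          x[i] <- x[i + 1]
        }
      }
      x
    }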
    

    Then you will obtain:

    > f(c("a", "a", "b", "a", "b", "a", "b", "b"))
    [1] "a" "a" "a" "a" "a" "a" "b" "b"
    
    > f(c("a", "b", "b"))
    [1] "b" "b" "b"
    
    > f(c("b", "b", "a"))
    [1] "b" "b" "b"
    
    > f(c("a", "b", "c", "d", "c", "a", "b"))
    [1] "a" "b" "c" "c" "c" "a" "b"
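
The question asks for the replacement to happen within ids. One way to do that, sketched here under some assumptions, is base R's `ave()`, which splits a vector by group, applies a function to each piece, and returns the results in the original row order. The helper name `fix_isolated` and the output column `status_fixed` are hypothetical; the helper repeats the sliding-window logic of `f` above with a guard for groups shorter than three rows, and the sketch assumes `status` has no missing values.

```r
# Example data from the question (status kept as character, not factor)
df <- data.frame(
  id = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  month = c(1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6),
  status = c("a", "a", "b", "a", "b", "b", "a", "a", "b", "b", "b", "b"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: same sliding-window logic as f, guarded for short input
fix_isolated <- function(x) {
  l <- length(x)
  if (l < 3) return(x)
  # First element: copy the second if the second and third agree
  if (x[2] == x[3] && x[2] != x[1]) x[1] <- x[2]
  # Last element: copy its predecessor if the two before it agree
  if (x[l - 1] == x[l - 2] && x[l - 1] != x[l]) x[l] <- x[l - 1]
  # Interior elements, scanned left to right
  for (i in 2:(l - 1)) {
    if (x[i + 1] == x[i - 1] && x[i] != x[i + 1]) x[i] <- x[i + 1]
  }
  x
}

# The window relies on row order, so sort by month within each id first
df <- df[order(df$id, df$month), ]

# Apply the helper separately to each id's status sequence
df$status_fixed <- ave(df$status, df$id, FUN = fix_isolated)
```

For the example data only row 3 changes: `status_fixed` for id 1 is a, a, a, a, b, b, and id 2 is left as it was.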