r dataframe dplyr zoo

Last observation carried forward and last observation carried backward in R


I have a dataset that looks like this:

df <- data.frame(ID=c(1,1,1,1,1,2,2,2,3,3), values=c(NA, NA, 12, 13, NA, 5, NA, NA, NA, 1))

I want an output like this: within each group, the last observation is carried forward, except where a group starts with NA values before its first filled-in value, in which case that value should be carried backward:

df <- data.frame(ID=c(1,1,1,1,1,2,2,2,3,3), values=c(12, 12, 12, 13, 13, 5, 5, 5, 1, 1))

I have been working with dplyr and na.locf from the zoo package. So far my approach has been this:

library(dplyr)
library(zoo)

df %>%
  group_by(ID) %>%
  mutate_all(funs(na.locf(., na.rm = FALSE)))

However, this only carries the last observation forward. Setting fromLast = TRUE in na.locf carries the last observation backward instead.

But how do I combine the two, so that both are applied?

Thank you so much in advance!


Solution

  • This should work:

    library(tidyverse)

    # Input data
    df <- data.frame(ID=c(1,1,1,1,1,2,2,2,3,3), values=c(NA, NA, 12, 13, NA, 5, NA, NA, NA, 1))
    # Expected result, kept for comparison
    df2 <- data.frame(ID=c(1,1,1,1,1,2,2,2,3,3), values=c(12, 12, 12, 13, 13, 5, 5, 5, 1, 1))

    # "downup" fills forward within each group first, then backward for any leading NAs
    df <- df %>%
      group_by(ID) %>%
      fill(values, .direction = "downup") %>%
      ungroup()
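If you would rather stay with na.locf from zoo, as in the question, the same result can probably be reached by chaining a forward pass and a backward pass inside each group. The following is only a minimal sketch under that assumption; the helper name locf_then_locb is made up for illustration and is not part of zoo or dplyr.

```r
library(dplyr)
library(zoo)

df <- data.frame(ID = c(1,1,1,1,1,2,2,2,3,3),
                 values = c(NA, NA, 12, 13, NA, 5, NA, NA, NA, 1))

# Hypothetical helper: forward fill first, then backward fill whatever is
# still NA (which can only be leading NAs within the vector).
locf_then_locb <- function(x) {
  x <- na.locf(x, na.rm = FALSE)              # last observation carried forward
  na.locf(x, na.rm = FALSE, fromLast = TRUE)  # remaining leading NAs carried backward
}

result <- df %>%
  group_by(ID) %>%
  mutate(values = locf_then_locb(values)) %>%
  ungroup()

result$values
# 12 12 12 13 13 5 5 5 1 1
```

The order of the two passes matters: running the backward pass first would fill interior gaps with the next value instead of the previous one, which is not what the question asks for.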