[SOLVED] Replacing NA's within a set of similar values

Replacing NA's within a set of similar values

I have a column in a dataset looking like this:

cluster_id
1
1
1
1
NA
1
NA
NA
2
NA
2
NA
3
NA
NA
3


cluster_id <- c("1","1","1","1","NA","1","NA","NA","2","NA","2","NA","3","NA","NA","3")

The order is already pre-defined before using a time column. What I want is to substitute the NA's that are within each cluster ID, i.e. if there's a row with 2, then an NA, and then a 2 again, I want that NA to become 2. The NA's between numbers stay as NA's. Example:

cluster_id   cluster_id_new
1            1
1            1
1            1
1            1
NA           1
1            1
NA           NA
NA           NA
2            2
NA           2
2            2
NA           NA
3            3
NA           3
NA           3
3            3

I found the zoo::na.locf function in this post, which seems to be close to what I want, but I also need to take in consideration the value after the NA. Any thoughts?

Solution

Update: thanks to @Darren Tsai (who provided a better version):

library(dplyr)
library(tidyr)

tibble(cluster_id) %>% 
  mutate(down = cluster_id, up = cluster_id) %>% 
  fill(down, .direction = "down") %>% 
  fill(up, .direction = "up") %>% 
  mutate(cluster_id_new = if_else(down == up, down, NA)) %>% 
  select(-c(down, up))

Original answer: Here is tidyverse way using mainly fill:

library(dplyr)
library(tidyr)

tibble(cluster_id) %>%
  mutate(helper = row_number(),
         filldown = fill(., cluster_id, .direction = "down")$cluster_id,
         fillup = fill(., cluster_id, .direction = "up")$cluster_id,
         cluster_id_new = ifelse(filldown == fillup, filldown, NA_real_)) %>%
  select(cluster_id, cluster_id_new)

cluster_id cluster_id_new
        <dbl>          <dbl>
 1          1              1
 2          1              1
 3          1              1
 4          1              1
 5         NA              1
 6          1              1
 7         NA             NA
 8         NA             NA
 9          2              2
10         NA              2
11          2              2
12         NA             NA
13          3              3
14         NA              3
15         NA              3
16          3              3