Use mutate and case_when in empty columns

The same column can be in one of two states:

1.- Having NA and values.

2.- Having only NA.

I need a single mutate() that handles both cases for the same column. However, when the column is completely empty (only NA), I get an error.

How can I handle a completely empty column inside mutate() using case_when(), while still covering the other case?

    library(dplyr)

    case_1 <- tibble("a" = c("a","a", "a","a"),
                     "b" = c(NA,"b", NA,"b"))

    case_2 <- tibble("a" = c("a","a", "a","a"),
                     "b" = c(NA,NA,NA,NA))

    case_1 <- case_1 %>% mutate("a" = case_when(!is.na(a) ~ a,
                                                is.na(a) ~ na.omit(unique(a))),
                                "b" = case_when(!is.na(b) ~ b,
                                                is.na(b) ~ na.omit(unique(b))))

    case_2 <- case_2 %>% mutate("a" = case_when(!is.na(a) ~ a,
                                                is.na(a) ~ na.omit(unique(a))),
                                "b" = case_when(!is.na(b) ~ b,
                                                is.na(b) ~ na.omit(unique(b))))

The first case (case_1) works fine, but case_2 throws an error.

Solution

It looks as if you're trying to fill the NA values from the non-NA values, both forwards and backwards. You can do this with tidyr::fill(), making sure you specify the .direction:

case_1 |>
    tidyr::fill(b, .direction = "downup")

#   a     b
#   <chr> <chr>
# 1 a     b
# 2 a     b
# 3 a     b
# 4 a     b

case_2 |>
    tidyr::fill(b, .direction = "downup")

#   a     b
#   <chr> <lgl>
# 1 a     NA
# 2 a     NA
# 3 a     NA
# 4 a     NA

However, in the general case where you want to use dplyr::case_when() and all the values in b might be NA, you can't have na.omit(unique(b)) on the right-hand side, because it produces a zero-length vector. A zero-length vector is the wrong size for a column in a data frame that has rows, and it will not (cannot) be recycled to the correct length; only length-one vectors are recycled. case_when() evaluates all the possible return vectors, even if they are never assigned. For example:

case_2 |>
    mutate(
        "b" = case_when(
            TRUE ~ NA,
            FALSE ~ na.omit(unique(b))
        )
    )
# Error in `mutate()`:
# ℹ In argument: `b = case_when(TRUE ~ NA, FALSE ~ na.omit(unique(b)))`.
# Caused by error:
# ! `b` must be size 4 or 1, not 0.
# Run `rlang::last_trace()` to see where the error occurred.
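To see where the size 0 in that message comes from, it can help to look at the right-hand side on its own, outside mutate(). The following is a minimal base R sketch; the variable names are only for illustration.

```r
# The all-NA column from case_2
b_empty <- c(NA, NA, NA, NA)
# The mixed column from case_1
b_mixed <- c(NA, "b", NA, "b")

# unique() collapses the column, then na.omit() drops the NA that is left
rhs_empty <- na.omit(unique(b_empty))
rhs_mixed <- na.omit(unique(b_mixed))

length(rhs_empty)  # 0: nothing left to recycle into four rows
length(rhs_mixed)  # 1: a single value, which can be recycled
```

This also suggests why case_1 works: its right-hand side happens to have length one. A column with two or more distinct non-NA values would give a right-hand side that is neither length one nor the column length, so it would presumably fail in the same way.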

Clearly, FALSE can never be TRUE, so na.omit(unique(b)) will never actually be assigned. However, case_when() describes itself as a general vectorised if-else, and dplyr::if_else() is stricter than base::ifelse(): it performs various safety checks for type and length, and these are what throw the error. You would see a similar size error with b = if_else(TRUE, NA, na.omit(unique(b))), but not with b = ifelse(TRUE, NA, na.omit(unique(b))).
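The difference between the two functions can be checked outside mutate() as well. This is a small sketch rather than a full explanation of either function; the exact error message depends on the dplyr version, so the error is captured with tryCatch() instead of being printed.

```r
library(dplyr)

# The zero-length right-hand side produced by an all-NA column
empty_rhs <- na.omit(unique(c(NA, NA, NA, NA)))

# base::ifelse() never touches `no` when the test is TRUE
base_result <- ifelse(TRUE, NA, empty_rhs)

# dplyr::if_else() validates `false` even though it is not needed here
dplyr_result <- tryCatch(
  if_else(TRUE, NA, empty_rhs),
  error = function(e) e
)

is.na(base_result)              # TRUE
inherits(dplyr_result, "error") # TRUE
```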

A plain if () ... else also skips the branch that is not taken. So you can keep your case_when() and wrap it in an if () inside mutate() that simply returns NA when all values are NA, so that the case_when() is never evaluated:

case_2 |> mutate(
    "a" = case_when(
        !is.na(a) ~ a,
        is.na(a) ~ na.omit(unique(a))
    ),
    "b" = if (all(is.na(b))) NA else case_when(
            !is.na(b) ~ b,
            is.na(b) ~ na.omit(unique(b))
        )
)

#   a     b
#   <chr> <lgl>
# 1 a     NA
# 2 a     NA
# 3 a     NA
# 4 a     NA
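If the same guard is needed for several columns, one option is to move it into a small function and apply it with across(). The sketch below is only one way to do this: fill_single_value() is a hypothetical helper name, not part of dplyr, and it assumes each column holds at most one distinct non-NA value, as in the question.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): fill NAs with the column's single
# non-NA value, and return an all-NA column unchanged.
fill_single_value <- function(x) {
  values <- unique(x[!is.na(x)])
  if (length(values) == 0) {
    return(x)  # nothing to fill from; this also keeps the column's type
  }
  # Assumption: at most one distinct non-NA value is expected per column.
  stopifnot(length(values) == 1)
  case_when(
    !is.na(x) ~ x,
    is.na(x) ~ values
  )
}

case_1 <- tibble(a = c("a", "a", "a", "a"), b = c(NA, "b", NA, "b"))
case_2 <- tibble(a = c("a", "a", "a", "a"), b = c(NA, NA, NA, NA))

filled_1 <- case_1 |> mutate(across(c(a, b), fill_single_value))
filled_2 <- case_2 |> mutate(across(c(a, b), fill_single_value))
```

Returning x itself in the all-NA branch, rather than a bare NA, means a column that was created as character (for example with NA_character_) stays character instead of becoming logical.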