r dplyr

When should .default be used over TRUE in dplyr::case_when?


dplyr::case_when has an optional argument, .default, which the documentation describes as follows:

.default The value used when all of the LHS inputs return either FALSE or NA.

However, this can also be achieved by setting the last LHS to TRUE.

As such, I am asking when I should use

data %>%
  mutate(test = case_when(
    a == "Foo" ~ "Bar",
    a == "Baz" ~ "Foo",
    TRUE ~ "Other"
  ))

instead of

data %>%
  mutate(test = case_when(
    a == "Foo" ~ "Bar",
    a == "Baz" ~ "Foo",
    .default = "Other"
  ))

Solution

  • For now, both options produce the same result. However, TRUE ~ is discouraged and slated for deprecation because it relies on "unsafe recycling of the LHS inputs". It still works, but .default was introduced in dplyr 1.1.0 to replace it. Here is what the changelog says:

    case_when() (#5106):

    Has a new .default argument that is intended to replace usage of TRUE ~ default_value as a more explicit and readable way to specify a default value. In the future, we will deprecate the unsafe recycling of the LHS inputs that allows TRUE ~ to work, so we encourage you to switch to using .default.
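
To make the equivalence concrete, here is a minimal sketch that runs both forms from the question on a small made-up tibble and compares the results. The sample values in `data` (including the `NA`) are assumptions for illustration, and the code assumes dplyr 1.1.0 or later is installed so that `.default` is available.

```r
library(dplyr)

# Hypothetical sample data; the NA row checks the "FALSE or NA" fallback case.
data <- tibble(a = c("Foo", "Baz", "Qux", NA))

# Older idiom: a catch-all TRUE condition as the last formula.
with_true <- data %>%
  mutate(test = case_when(
    a == "Foo" ~ "Bar",
    a == "Baz" ~ "Foo",
    TRUE ~ "Other"
  ))

# Newer idiom: the explicit .default argument.
with_default <- data %>%
  mutate(test = case_when(
    a == "Foo" ~ "Bar",
    a == "Baz" ~ "Foo",
    .default = "Other"
  ))

# Both approaches should yield the same column.
identical(with_true$test, with_default$test)
```

In both versions the "Qux" and `NA` rows fall through to "Other", since no earlier condition evaluates to TRUE for them. Given the changelog's stated direction, `.default` looks like the safer choice for new code.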