r tidyverse

Why does this create an "NAs introduced by coercion" warning?


I'm curious why the following produces an "NAs introduced by coercion" warning:

library(dplyr)

# Example dataframe
df <- tibble(
  session = c("a", "2")  # c("a", 2) coerces 2 to the character "2" anyway
)

df %>%
  mutate(sessionNum = case_when(
    session == "a" ~ 1,
    TRUE ~ as.numeric(session)
  ))

I thought nothing would need to be coerced to NA, since "a" is already handled by the first case_when() branch.

Even the following dataframe produces the warning!

# Example dataframe
df <- tibble(
  session = c("a")
)

Solution

  • dplyr's if_else() and case_when() evaluate each RHS expression on the whole vector, regardless of whether the condition is TRUE for a given element, and only afterwards pick the values that match. According to Hadley Wickham, this is needed for type stability.

    https://github.com/tidyverse/dplyr/issues/5321

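    Because the RHS is evaluated for all values, as.numeric() is applied to the entire column (i.e., as.numeric(c('a', '2'))). The "a" cannot be parsed as a number and becomes NA, which triggers the warning, even though that NA never ends up in the result. The same applies to the one-row dataframe, where as.numeric("a") is still evaluated.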

    Also noted here: https://github.com/tidyverse/dplyr/issues/5341
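
One possible way to avoid the warning, sketched here under the assumption that "a" should simply map to 1 as in the question: do the recoding while the column is still character, and convert to numeric once, outside case_when(). The name result is only introduced for this illustration.

```r
library(dplyr)

# Same example dataframe as in the question
df <- tibble(session = c("a", "2"))

# Running as.numeric(df$session) on its own reproduces the warning,
# because "a" has no numeric interpretation.

# Workaround sketch: keep every RHS as character inside case_when(),
# then convert the already-cleaned vector to numeric in one step.
result <- df %>%
  mutate(sessionNum = as.numeric(case_when(
    session == "a" ~ "1",
    TRUE ~ session
  )))
```

Here as.numeric() only ever sees "1" and "2", so nothing has to be coerced to NA. Wrapping the original call in suppressWarnings() would also silence the message, but it would hide genuine coercion problems in other rows as well.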