r, if-statement, tidyverse

Using if_else in a mutate command yields an error about false having the wrong size


Consider the following data:

library(tidyverse)
df <- data.frame(group = rep(letters[1:3], each = 3),
                 x     = 1:9)

I want to recode values by group, depending on whether all values in the group fall below a certain threshold.

The code below throws an error:

df |> 
  mutate(test = if_else(all(x < 4), 0, x), .by = group)

Error in `mutate()`:
ℹ In argument: `test = if_else(all(x < 4), 0, x)`.
ℹ In group 1: `group = "a"`.
Caused by error in `if_else()`:
! `false` must have size 1, not size 3.
Run `rlang::last_trace()` to see where the error occurred.

However, moving the condition check out of the if_else call works as expected:

df |> 
  mutate(helper = all(x < 4),
         test = if_else(helper == TRUE, 0, x), .by = group)

  group x helper test
1     a 1   TRUE    0
2     a 2   TRUE    0
3     a 3   TRUE    0
4     b 4  FALSE    4
5     b 5  FALSE    5
6     b 6  FALSE    6
7     c 7  FALSE    7
8     c 8  FALSE    8
9     c 9  FALSE    9

I have a vague idea of why that is: the TRUE part is just a scalar (0), while the FALSE part in if_else covers all three rows of each group. But I would like to understand the issue better, in particular why if_else doesn't recycle the shorter scalar to the length of the false argument.


Solution

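  • all(x < 4) returns a single TRUE or FALSE per group. dplyr::if_else (like data.table::fifelse) requires true and false to have either the size of the condition or size 1, so with a length-1 condition the length-3 vector x is rejected. A plain if/else has no such restriction: it returns either 0, which mutate then recycles to the group size, or the whole vector x (length 3).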

    library(dplyr)
    
    df %>% 
      mutate(test = if (all(x < 4)) 0 else x, .by = group)
    
      group x test
    1     a 1    0
    2     a 2    0
    3     a 3    0
    4     b 4    4
    5     b 3    3
    6     b 6    6
    7     c 7    7
    8     c 8    8
    9     c 9    9
    

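    Note that base ifelse runs without an error but can give wrong results. Its output has the length of the test (1 here), so it silently keeps only the first x of each group, and mutate recycles that single value across the group. With the data below, group b would become 4, 4, 4 instead of 4, 3, 6.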

    Data

    (slightly modified from the question: x in row 5 is 3 instead of 5)

    df <- structure(list(group = c("a", "a", "a", "b", "b", "b", "c", "c",
    "c"), x = c(1, 2, 3, 4, 3, 6, 7, 8, 9)), row.names = c(NA, -9L
    ), class = "data.frame")
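
To make the ifelse pitfall concrete, here is a minimal sketch that runs the three variants side by side on the modified data. The column names wrong, right and also_right are made up for illustration. The also_right variant, which expands the condition to group length with rep() and n() so that if_else accepts it, is one possible alternative and not part of the answer above. It assumes dplyr >= 1.1.0 (for .by) and R >= 4.1 (for |>).

```r
library(dplyr)

# Same modified data as in the answer (group "b" contains a 3).
df <- data.frame(group = rep(c("a", "b", "c"), each = 3),
                 x     = c(1, 2, 3, 4, 3, 6, 7, 8, 9))

res <- df |>
  mutate(
    # base::ifelse() returns a result as long as its test (length 1 here),
    # so only the first x of each group is kept and then recycled by mutate().
    wrong = ifelse(all(x < 4), 0, x),
    # if/else returns either the scalar 0 or the whole vector x.
    right = if (all(x < 4)) 0 else x,
    # Illustrative alternative: repeat the condition to group length,
    # so if_else() sees a condition of the same size as x.
    also_right = if_else(rep(all(x < 4), n()), 0, x),
    .by = group
  )

res$wrong       # 0 0 0 4 4 4 7 7 7
res$right       # 0 0 0 4 3 6 7 8 9
res$also_right  # 0 0 0 4 3 6 7 8 9
```

The wrong column shows why the silent recycling in ifelse is dangerous here: groups b and c lose their row-level values without any warning, whereas if_else refuses the mismatched sizes up front.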