Consider the following data:
library(tidyverse)
df <- data.frame(group = rep(letters[1:3], each = 3),
x = 1:9)
I now want to recode values by group, based on a check of whether all values in the group fall below a certain threshold.
Using the code below leads to an error:
df |>
mutate(test = if_else(all(x < 4), 0, x), .by = group)
Error in `mutate()`:
ℹ In argument: `test = if_else(all(x < 4), 0, x)`.
ℹ In group 1: `group = "a"`.
Caused by error in `if_else()`:
! `false` must have size 1, not size 3.
Run `rlang::last_trace()` to see where the error occurred.
However, moving the criterion check out of the if_else call works as expected:
df |>
mutate(helper = all(x < 4),
test = if_else(helper == TRUE, 0, x), .by = group)
group x helper test
1 a 1 TRUE 0
2 a 2 TRUE 0
3 a 3 TRUE 0
4 b 4 FALSE 4
5 b 5 FALSE 5
6 b 6 FALSE 6
7 c 7 FALSE 7
8 c 8 FALSE 8
9 c 9 FALSE 9
I have a vague idea why that is: the true part is just a scalar (0), while the false part in if_else covers all three rows of each group. I would like to understand the issue a bit better, though, and in particular why if_else doesn't recycle the scalar to the length of the false argument.
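The result of all(x < 4) is a single TRUE or FALSE per group, so the condition has length 1. That is why dplyr::if_else and data.table::fifelse do not work here: their true and false arguments must have length 1 or the same length as the condition, and x has length 3. Using if instead returns either 0, which mutate then recycles to the group size, or the whole vector x (length 3).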
library(dplyr)
df %>%
mutate(test = if(all(x < 4)) 0 else x, .by = group)
group x test
1 a 1 0
2 a 2 0
3 a 3 0
4 b 4 4
5 b 3 3
6 b 6 6
7 c 7 7
8 c 8 8
9 c 9 9
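The size mismatch can also be checked outside of mutate. The following is a minimal sketch, assuming dplyr is installed; the vector x stands in for the values of a single group, and repeating the condition with rep is only one possible way to keep if_else, not necessarily the preferred one.

```r
library(dplyr)

# Values of one group, standing in for what mutate() sees per group
x <- c(4, 3, 6)

# The condition collapses to a single logical value
cond <- all(x < 4)
length(cond)

# A length-3 `false` does not fit a length-1 condition, so if_else() errors
res <- try(if_else(cond, 0, x), silent = TRUE)
inherits(res, "try-error")

# Repeating the condition to the group size makes if_else() usable again
fixed <- if_else(rep(cond, length(x)), 0, x)
fixed
```

Inside mutate, the same idea would read if_else(rep(all(x < 4), n()), 0, x), although the plain if/else is shorter.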
Note that ifelse works but might give the wrong results. Because the condition has length 1, it returns only the first value of x per group, and that single value is then recycled across the group without complaining.
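A small base R sketch of that pitfall, using the values of group b from the modified data as a stand-in for one group:

```r
# Values of one group (group "b" in the modified data)
x <- c(4, 3, 6)

# ifelse() shapes its result like the test, which has length 1 here,
# so only the first element of x is returned
out <- ifelse(all(x < 4), 0, x)
out

# if/else returns the whole chosen branch instead
out_if <- if (all(x < 4)) 0 else x
out_if
```

Within mutate, the single value in out would be recycled to every row of the group, so group b would silently become 4, 4, 4 instead of 4, 3, 6.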
Data (slightly modified from the question, so that group b contains a value below 4):
df <- structure(list(group = c("a", "a", "a", "b", "b", "b", "c", "c",
"c"), x = c(1, 2, 3, 4, 3, 6, 7, 8, 9)), row.names = c(NA, -9L
), class = "data.frame")