I've got a database query running from R which can have multiple columns filled with NA, some of these columns I need further down the line but one specific column can be dropped if all values are NA.
I usually use purrr::discard(~all(is.na(.))) to drop the columns which are all NA, but since this dataframe can contain multiple columns with NA where I only want to remove one I'm struggling to make this specific to the column in a tidyverse solution.
I've currently got this workaround:
if(sum(is.na(Orders$Originator)) == nrow(Orders)) {
Orders <- Orders %>%
select(-Originator)
}
But it would improve the readability if I can have this in a tidyverse solution. Hope someone can be of help!
Thanks!
Using example data:
df <- data.frame(
x = c(1,2,NA),
y = NA,
z = c(3,4,5)
)
Here column y
is the target column to check if all
is.na
. Your if
and else
will be contained in curly braces. The braces will suppress the pipe from using the first argument in a function. Note the else
will keep your data frame in the pipe if the condition is false.
library(tidyverse)
df %>%
{ if (all(is.na(.$y))) select(., -y) else . }
Output
x z
1 1 3
2 2 4
3 NA 5