ID <- 1:6
math <- c("YES","NO","YES","NO",NA,NA)
history <- c(NA,NA,"NO","NO","YES",NA)
dt <- data.frame(ID, math, history)
ID math history
1 1 YES <NA>
2 2 NO <NA>
3 3 YES NO
4 4 NO NO
5 5 <NA> YES
6 6 <NA> <NA>
I want to make an additional column ("pass") as follows
If a student ever had "yes" at least once: "YES" (No matter if another subject is missing data or not or so.)
If the student did not get "yes"
so, the columns would like this: (I can do this manually with this minimal example. but not with my real data)
> dt
ID math history pass
1 1 YES <NA> YES
2 2 NO <NA> NO
3 3 YES NO YES
4 4 NO NO NO
5 5 <NA> YES YES
6 6 <NA> <NA> <NA>
I tried to use
dt$pass <- ifelse(rowSums(dt[,-1]=="YES",na.rm=T)>0,"YES","NO")
this code, but it was tricky
because if I put na.rm=TRUE
they consider NA is "NO"
(ID 6 student will be "NO")
if I put na.rm=FALSE
, an only students that have both subject's data are considered.
In my data, I have really lots of columns, not only math and history.
A simple base
solution is
dt$pass <- apply(dt[-1], 1, \(x) sort(x, dec = TRUE)[1])
# > dt
# ID math history pass
# 1 1 YES <NA> YES
# 2 2 NO <NA> NO
# 3 3 YES NO YES
# 4 4 NO NO NO
# 5 5 <NA> YES YES
# 6 6 <NA> <NA> <NA>
Its dplyr
equivalent is
library(dplyr)
dt %>%
rowwise() %>%
mutate(pass = sort(c_across(-1), dec = TRUE)[1]) %>%
ungroup()