
na.omit: drop only the rows where selected columns are all NA


In the example below, if I call na.omit() on df, it drops every row that contains any NA, including the rows where d has a value of 10 and 12 (because c is NA there).

Is there a way to do an na.omit with a condition, so that a row is omitted only when both column c and column d are NA? That way I would still have 2 NA's in the c column, because a, b and d have data for those rows.

a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
b <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
c <- c(0.1, 3, 5, NA, NA, NA, NA, NA, NA, NA)
d <- c(3, 6, 9, 10, 12, NA, NA, NA, NA, NA)

df <- as.data.frame(cbind(a, b, c, d))

df
na.omit(df)

Solution

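  • Here is a base R way to do this, using the [ subsetting operator with a logical condition. Note that na.omit() is working exactly as designed and as described in its documentation: "na.omit returns the object with incomplete cases removed." A row with an NA in any column counts as incomplete, so to drop only the rows where c and d are both NA you have to build that condition yourself.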

    
    a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    b <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
    c <- c(0.1, 3, 5, NA, NA, NA, NA, NA, NA, NA)
    d <- c(3, 6, 9, 10, 12, NA, NA, NA, NA, NA)
    
    df <- data.frame(a, b, c, d)
    # keep every row except those where both c and d are NA
    df[!(is.na(df$c) & is.na(df$d)), ]
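
If more than two columns are involved, chaining is.na() calls with & gets unwieldy. The same idea can be written with rowSums(). This is only a minimal sketch: the names cols, all_na and result are made up here for illustration and are not part of the original answer.

```r
# Same example data as in the question
a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
b <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
c <- c(0.1, 3, 5, NA, NA, NA, NA, NA, NA, NA)
d <- c(3, 6, 9, 10, 12, NA, NA, NA, NA, NA)
df <- data.frame(a, b, c, d)

# Columns that must ALL be NA before a row is dropped (illustrative name)
cols <- c("c", "d")

# is.na() on a data frame gives a logical matrix; rowSums() counts the NAs per row.
# A row is flagged only when every one of the chosen columns is NA.
all_na <- rowSums(is.na(df[cols])) == length(cols)

# Keep the rows that are not flagged
result <- df[!all_na, ]
result
```

With the example data this should keep the same rows as the is.na(df$c) & is.na(df$d) version; adding a column name to cols extends the condition without touching the rest of the code.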