r dplyr na

Count the number of non-NA numeric values of each row in dplyr


I created a data frame df:

df <- data.frame(id = 1:10,
    var1 = 10:19,
    var2 = sample(c(1:2, NA), 10, replace = TRUE),
    var3 = sample(c(3:5, NA), 10, replace = TRUE))

What I need is a new column var4 that counts the number of non-NA values in each row (excluding the id column). For example, if a row has var1=19, var2=1, var3=NA, then var4=2. I could not find a good way to do this in dplyr. I am looking for something like:

df %>% mutate(var4 = ... )

I would appreciate any help with this.


Solution

  • Use select + is.na + rowSums. select(., -id) returns the original data frame (.) with the id column excluded, and rowSums(!is.na(...)) then counts the non-NA values in each row:

    df %>% mutate(var4 = rowSums(!is.na(select(., -id))))
    
    #   id var1 var2 var3 var4
    #1   1   10   NA    4    2
    #2   2   11    1   NA    2
    #3   3   12    2    5    3
    #4   4   13    2   NA    2
    #5   5   14    1   NA    2
    #6   6   15    1   NA    2
    #7   7   16    1    5    3
    #8   8   17   NA    4    2
    #9   9   18   NA    4    2
    #10 10   19   NA   NA    1
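
The `.` placeholder in select(., -id) comes from the magrittr pipe. As a possible variation, here is a minimal sketch that avoids the placeholder by using pick(), assuming dplyr 1.1.0 or later is installed. The small fixed data frame is made up for illustration so the counts are reproducible, and the base R line is only there as a cross-check:

```r
library(dplyr)

# Hypothetical fixed data (not the random sample from the question),
# so the result is the same on every run
df <- data.frame(
  id   = 1:4,
  var1 = c(10L, 11L, 12L, 13L),
  var2 = c(NA, 1L, 2L, NA),
  var3 = c(4L, NA, 5L, NA)
)

# pick(-id) selects every column except id inside mutate()
# (assumption: dplyr >= 1.1.0, where pick() was introduced)
res <- df %>%
  mutate(var4 = rowSums(!is.na(pick(-id))))

# Base R cross-check: drop id by name, then count non-NA values per row
base_counts <- rowSums(!is.na(df[setdiff(names(df), "id")]))

print(res)
# var4 is 2, 2, 3, 1 for these four rows
```

Since is.na() on a data frame returns a logical matrix, rowSums() can count the TRUE values of its negation row by row, which is the same idea as in the answer above.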