r dataframe dplyr

Get TRUE for each row where any value in specific columns exists in an external vector


I am trying to add a new column to a data frame that is TRUE (or 'Yes') whenever any value in a row, across specific columns, exists in an external vector.

vector <- c('a', 'f', 'm')

df

COL1    SR3   SR_op   letter  SR_2
12       y     f       ab      m
76       e     r       cd      t
90       a     b       jk      c 
40       z     f       fg      4
34       u     v       xy      w

I was trying this:

library(dplyr)

df %>% mutate(across(.cols= starts_with('SR'), ~ case_when(. %in% vector ~ 'Yes')))

I would like to get the following:

COL1    SR3   SR_op   letter  SR_2  in_vector
12       y     f       ab      m      Yes
76       e     r       cd      t      NA
90       a     b       jk      c      Yes
40       z     f       fg      4      Yes
34       u     v       xy      w      NA

Solution

  • The function you are looking for is if_any(), which returns TRUE for a row when the condition holds in at least one of the selected columns:

    library(dplyr)
    df |> 
      mutate(in_vector = if_any(starts_with('SR'), ~ .x %in% vector))
    
    #   COL1 SR3 SR_op letter SR_2 in_vector
    # 1   12   y     f     ab    m      TRUE
    # 2   76   e     r     cd    t     FALSE
    # 3   90   a     b     jk    c      TRUE
    # 4   40   z     f     fg    4      TRUE
    # 5   34   u     v     xy    w     FALSE
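
If you need the 'Yes'/NA layout from the question rather than TRUE/FALSE, the logical result can be mapped afterwards. The following is a minimal sketch, assuming the SR columns are character columns as printed in the question; the data frame is rebuilt by hand from the table shown there, and `hit` is a hypothetical helper column introduced only for illustration.

```r
library(dplyr)

# Example data rebuilt from the question (assumption: SR* columns are character)
df <- data.frame(
  COL1   = c(12, 76, 90, 40, 34),
  SR3    = c("y", "e", "a", "z", "u"),
  SR_op  = c("f", "r", "b", "f", "v"),
  letter = c("ab", "cd", "jk", "fg", "xy"),
  SR_2   = c("m", "t", "c", "4", "w")
)
vector <- c("a", "f", "m")

res <- df |>
  mutate(
    # `hit` (hypothetical name): TRUE when at least one SR* column in the row
    # holds a value that is found in `vector`
    hit = if_any(starts_with("SR"), ~ .x %in% vector),
    # Map TRUE to "Yes" and FALSE to NA to match the requested layout
    in_vector = if_else(hit, "Yes", NA_character_)
  ) |>
  select(-hit)  # drop the helper column again

res
```

The original attempt with across() does not produce a single column because mutate(across(...)) applies the function to each SR column separately and overwrites those columns in place, whereas if_any() collapses the per-column results into one logical value per row.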