r dplyr tidyverse

dplyr: find whether a column value is a substring of any item in a fixed list, and mutate the value


I started with a simple example that works, but I don't know how to use it within dplyr's mutate().

This works, I get "bcd-234":

library(tidyverse)
list = c("abc-123", 'bcd-234', 'cde-345', 'bcd-987')
s = 'bcd'
list[str_detect(list, s)][1]

But the same approach fails when I try to use it within dplyr's mutate():

df <- tibble(
  name = c('aaa', 'bbb', 'ccc', 'ddd', 'abc', 'bcd', 'cde')
)


df |> mutate(
  new_name = if_else(any(str_detect(list, name)), list[str_detect(list, name)][1], name)
  )

I get an error:

! Can't recycle `string` (size 4) to match `pattern` (size 7).

Thanks


Solution

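    res <- df |>
      # rowwise() makes mutate() evaluate the expression once per row,
      # so name is a single string rather than the whole column
      rowwise() |>
      mutate(
        new_name = if (any(str_detect(list, name))) list(list[str_detect(list, name)]) else list(name)
      )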
    
    > res
    #   name  new_name 
    #   <chr> <list>   
    # 1 aaa   <chr [1]>
    # 2 bbb   <chr [1]>
    # 3 ccc   <chr [1]>
    # 4 ddd   <chr [1]>
    # 5 abc   <chr [1]>
    # 6 bcd   <chr [2]>
    # 7 cde   <chr [1]>
    
    > as.data.frame(res)
      name         new_name
    1  aaa              aaa
    2  bbb              bbb
    3  ccc              ccc
    4  ddd              ddd
    5  abc          abc-123
    6  bcd bcd-234, bcd-987
    7  cde          cde-345
    

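    P.S. Avoid using list as a variable name: it shadows the name of base R's list() function and makes calls such as list(list[...]) hard to read.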
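
The error in the question arises because str_detect() is vectorised over both of its arguments: inside mutate(), name is the whole 7-element column, and it cannot be recycled against the 4-element vector. The rowwise() answer above returns every match in a list column. If only the first match is wanted as a plain character column (as in the original list[...][1] example), one possible sketch uses purrr::map_chr(). The name candidates is a hypothetical replacement for list, and wrapping the pattern in fixed() is an assumption that the names should be matched as literal substrings rather than as regular expressions.

```r
library(tidyverse)

# `candidates` is a hypothetical name standing in for the question's `list`
candidates <- c("abc-123", "bcd-234", "cde-345", "bcd-987")

df <- tibble(
  name = c("aaa", "bbb", "ccc", "ddd", "abc", "bcd", "cde")
)

res <- df |>
  mutate(
    # map_chr() calls the function once per element of name
    # and returns a character vector of the same length
    new_name = map_chr(name, \(nm) {
      # fixed() treats nm as a literal substring, not a regex (assumption)
      hits <- candidates[str_detect(candidates, fixed(nm))]
      # keep the first match, or fall back to the original name
      if (length(hits) > 0) hits[1] else nm
    })
  )

res
```

With the data from the question, the row for bcd gets bcd-234 (the first of its two matches), and rows without any match keep their original name.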