
grepl in multiple columns in R


I'm trying to do a string search and replace across multiple columns in R. My code:

# Get columns of interest
selected_columns <- c(368,370,372,374,376,378,380,382,384,386,388,390,392,394)

#Perform grepl across multiple columns
df[,selected_columns][grepl('apples',df[,selected_columns],ignore.case = TRUE)] <- 'category1'

However, I'm getting the error:

Error: undefined columns selected

Solution

  • grep/grepl work on vectors or matrices, not on a data.frame/list. According to ?grep:

    x - a character vector where matches are sought, or an object which can be coerced by as.character to a character vector.

    We can loop over the columns (lapply) and replace the values based on the match

    df[, selected_columns] <- lapply(df[, selected_columns],
         function(x) replace(x, grepl('apples', x, ignore.case = TRUE), 'category1'))
    

    Or with dplyr

    library(dplyr)
    library(stringr)
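    # regex(ignore_case = TRUE) mirrors ignore.case = TRUE in the base R version
    df <- df %>%
         mutate_at(selected_columns, ~ replace(., str_detect(., regex('apples', ignore_case = TRUE)), 'category1'))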
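
To see the difference on a small scale, here is a minimal sketch in base R. The toy data frame and its column names (id, fruit1, fruit2) are made up for illustration and are not from the question. It shows that grepl() on a data frame returns one value per column rather than one per cell, and then applies the column-by-column lapply() replacement.

```r
# Toy data frame standing in for the real one (column names are hypothetical)
df <- data.frame(
  id = 1:3,
  fruit1 = c("Green apples", "pears", "APPLES"),
  fruit2 = c("plums", "apples (red)", NA),
  stringsAsFactors = FALSE
)
selected_columns <- c(2, 3)

# grepl() coerces the data frame with as.character(), which turns each
# column into a single string, so the result has one element per column
per_column <- grepl("apples", df[, selected_columns], ignore.case = TRUE)
length(per_column)

# Column-by-column replacement: inside lapply(), x is an ordinary vector,
# so grepl() returns one TRUE/FALSE per cell (FALSE for NA)
df[, selected_columns] <- lapply(
  df[, selected_columns],
  function(x) replace(x, grepl("apples", x, ignore.case = TRUE), "category1")
)
df
```

Note that grepl() returns FALSE for NA inputs, so missing values in this sketch are left untouched.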