[SOLVED] How to subset a list of data frames based on the string content of a specific column - R Language

Imagine I have the following list of data frames:

df1 <- data.frame (x = c(1, 2, 3), y = c(12, 11, 10), text = c("banana", "avocado", "letuce"))
df2 <- data.frame (x = c(4, 5, "letuce"), y = c(9, 8, 7), text = c("watermelon", "avocado", "grape"))
df3 <- data.frame (x = c(7, 8, 9), y = c(6, 5, 4), text = c("letuce", "apricot", "apple"))
df4 <- data.frame (x = c(10, 11, 12), y = c(3, "letuce", 1), text = c("pineaple", "blueberry", "morango"))

my_list <- list(df1, df2, df3, df4)

How can I keep only the data frames that contain the word "letuce" in the "text" column? A match in any other column (such as "x" in df2 or "y" in df4) should not count.

The desired result is this:

subset_list <- list(df1, df3)

I've managed to match the string using this code:

library(tidyverse)
lapply(my_list, with, str_detect(text, "letuce"))

Solution

You can do:

library(tidyverse)
my_list[my_list %>%
          map(.f = ~ any(.x$text == 'letuce')) %>%
          unlist()]

which gives:

[[1]]
  x  y    text
1 1 12  banana
2 2 11 avocado
3 3 10  letuce

[[2]]
  x y    text
1 7 6  letuce
2 8 5 apricot
3 9 4   apple
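
As a small variation on the indexing above (not part of the original answer), purrr's map_lgl() returns a logical vector directly, so the unlist() step is not needed. The names keep_idx and subset_list are only illustrative:

```r
library(purrr)

# Data from the question
df1 <- data.frame(x = c(1, 2, 3), y = c(12, 11, 10), text = c("banana", "avocado", "letuce"))
df2 <- data.frame(x = c(4, 5, "letuce"), y = c(9, 8, 7), text = c("watermelon", "avocado", "grape"))
df3 <- data.frame(x = c(7, 8, 9), y = c(6, 5, 4), text = c("letuce", "apricot", "apple"))
df4 <- data.frame(x = c(10, 11, 12), y = c(3, "letuce", 1), text = c("pineaple", "blueberry", "morango"))
my_list <- list(df1, df2, df3, df4)

# One TRUE/FALSE per data frame: does any value of `text` equal "letuce"?
keep_idx <- map_lgl(my_list, ~ any(.x$text == "letuce"))

# Logical indexing keeps only the data frames flagged TRUE (df1 and df3)
subset_list <- my_list[keep_idx]
```

Only the "text" column is inspected, so the "letuce" values sitting in x (df2) and y (df4) do not cause those data frames to be kept.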

This solution only matches values that are exactly 'letuce'. If you want to match values that merely contain the word 'letuce' (for example as part of a longer string), you can do:

my_list[my_list %>%
          map(.f = ~ any(str_detect(.x$text, 'letuce'))) %>%
          unlist()]
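
To make the difference between the two tests concrete, here is a minimal sketch. The vector vals is a hypothetical example, not data from the question. It also shows that str_detect() treats its pattern as a regular expression by default, and that wrapping the pattern in stringr's fixed() asks for a literal match instead:

```r
library(stringr)

# Hypothetical values: "letuce" appears only inside a longer string
vals <- c("letuce salad", "avocado")

exact_match   <- any(vals == "letuce")            # FALSE: no element is exactly "letuce"
partial_match <- any(str_detect(vals, "letuce"))  # TRUE: "letuce salad" contains it

# The default pattern is a regex, so "." matches any character
regex_hit   <- str_detect("abc", "a.c")           # TRUE
literal_hit <- str_detect("abc", fixed("a.c"))    # FALSE: no literal "a.c" in "abc"
```

For a plain word such as 'letuce' the regex and literal forms behave the same; fixed() only matters once the search string contains characters that are special in regular expressions.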

Inspired by B.Grothendieck's comment (I had totally forgotten about keep), this can be simplified to:

my_list %>%
  keep(.p = ~any(str_detect(.x$text, 'letuce')))
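
If loading the tidyverse is not desirable, roughly the same filtering can be sketched in base R. This is an alternative that the original answer does not give: Filter() stands in for keep(), and grepl() with fixed = TRUE stands in for str_detect(), doing a literal substring match:

```r
# Data from the question
df1 <- data.frame(x = c(1, 2, 3), y = c(12, 11, 10), text = c("banana", "avocado", "letuce"))
df2 <- data.frame(x = c(4, 5, "letuce"), y = c(9, 8, 7), text = c("watermelon", "avocado", "grape"))
df3 <- data.frame(x = c(7, 8, 9), y = c(6, 5, 4), text = c("letuce", "apricot", "apple"))
df4 <- data.frame(x = c(10, 11, 12), y = c(3, "letuce", 1), text = c("pineaple", "blueberry", "morango"))
my_list <- list(df1, df2, df3, df4)

# Predicate: TRUE when any value in the `text` column contains "letuce"
has_letuce <- function(d) any(grepl("letuce", d$text, fixed = TRUE))

# Filter() keeps the list elements for which the predicate is TRUE
subset_list <- Filter(has_letuce, my_list)
```

The helper name has_letuce is made up for this sketch. With the data above, the result should hold the same two data frames as list(df1, df3).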