Imagine I have the following list of data frames:
df1 <- data.frame (x = c(1, 2, 3), y = c(12, 11, 10), text = c("banana", "avocado", "letuce"))
df2 <- data.frame (x = c(4, 5, "letuce"), y = c(9, 8, 7), text = c("watermelon", "avocado", "grape"))
df3 <- data.frame (x = c(7, 8, 9), y = c(6, 5, 4), text = c("letuce", "apricot", "apple"))
df4 <- data.frame (x = c(10, 11, 12), y = c(3, "letuce", 1), text = c("pineaple", "blueberry", "morango"))
my_list <- list(df1, df2, df3, df4)
How can I keep only the data frames that contain the word "letuce" in the "text" column?
The desired result is this:
subset_list <- list(df1, df3)
I've managed to match the string using this code:
library(tidyverse)
lapply(my_list, with, str_detect(text, "letuce"))
You can do:
library(tidyverse)
my_list[my_list %>%
          map(.f = ~ any(.x$text == 'letuce')) %>%
          unlist()]
which gives:
[[1]]
  x  y    text
1 1 12  banana
2 2 11 avocado
3 3 10  letuce

[[2]]
  x y    text
1 7 6  letuce
2 8 5 apricot
3 9 4   apple
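As a possible shortening, purrr's map_lgl() returns a logical vector directly, so the separate unlist() step is not needed. A minimal sketch using the question's data (the name has_letuce is only illustrative):

```r
library(purrr)

# Data from the question
df1 <- data.frame(x = c(1, 2, 3), y = c(12, 11, 10), text = c("banana", "avocado", "letuce"))
df2 <- data.frame(x = c(4, 5, "letuce"), y = c(9, 8, 7), text = c("watermelon", "avocado", "grape"))
df3 <- data.frame(x = c(7, 8, 9), y = c(6, 5, 4), text = c("letuce", "apricot", "apple"))
df4 <- data.frame(x = c(10, 11, 12), y = c(3, "letuce", 1), text = c("pineaple", "blueberry", "morango"))
my_list <- list(df1, df2, df3, df4)

# One TRUE/FALSE per data frame: does any entry of `text` equal "letuce"?
has_letuce <- map_lgl(my_list, ~ any(.x$text == "letuce"))

# Use the logical vector to subset the list
subset_list <- my_list[has_letuce]
```

Only the text column is inspected, so the "letuce" values that appear in the x column of df2 and the y column of df4 do not cause those data frames to be kept.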
This solution assumes that you want exact matches, i.e. entries that are exactly 'letuce'. If you want to match entries that merely contain 'letuce' as a substring, you can do:
my_list[my_list %>%
          map(.f = ~ any(str_detect(.x$text, 'letuce'))) %>%
          unlist()]
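To make the difference between the exact test and the substring test concrete, here is a small sketch on a made-up vector (fruit_text is hypothetical and not part of the question's data):

```r
library(stringr)

# Hypothetical vector, for illustration only
fruit_text <- c("letuce", "letuce salad", "grape")

# Exact match: TRUE only where the whole string equals "letuce"
exact <- fruit_text == "letuce"

# Substring match: TRUE wherever "letuce" occurs anywhere in the string
partial <- str_detect(fruit_text, "letuce")

# Same substring match, but treating the pattern as a literal string
# rather than a regular expression
literal <- str_detect(fruit_text, fixed("letuce"))
```

With the question's data both tests select the same data frames, because every matching entry is exactly "letuce"; the results would only differ for entries such as "letuce salad".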
Inspired by B.Grothendieck's comment (I totally forgot about keep()), we could simply do:
my_list %>%
  keep(.p = ~ any(str_detect(.x$text, 'letuce')))
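If you would rather avoid loading the tidyverse, a similar result should be possible in base R. The following is a sketch of an alternative, not part of the answer above, using Filter() with grepl():

```r
# Data from the question
df1 <- data.frame(x = c(1, 2, 3), y = c(12, 11, 10), text = c("banana", "avocado", "letuce"))
df2 <- data.frame(x = c(4, 5, "letuce"), y = c(9, 8, 7), text = c("watermelon", "avocado", "grape"))
df3 <- data.frame(x = c(7, 8, 9), y = c(6, 5, 4), text = c("letuce", "apricot", "apple"))
df4 <- data.frame(x = c(10, 11, 12), y = c(3, "letuce", 1), text = c("pineaple", "blueberry", "morango"))
my_list <- list(df1, df2, df3, df4)

# Filter() keeps the list elements for which the predicate returns TRUE.
# grepl(..., fixed = TRUE) does a literal substring search on the text column.
subset_list <- Filter(
  function(d) any(grepl("letuce", d$text, fixed = TRUE)),
  my_list
)
```

Here Filter() plays the same role as keep(): it takes a predicate and a list and returns only the elements for which the predicate is TRUE.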