r, tidyverse, purrr

How to extract the data frame name from a list using map


I tried to use 'map' to filter a list of datasets. How can I get the name of the current data frame inside the function, which is the placeholder 'YYY' in my code?

library(dplyr)
library(purrr)

df1 <- data.frame(var1=seq(1:10), var2=seq(1:10), ID=1)
df2 <- data.frame(var2=seq(1:10), ID=2)
df3 <- data.frame(var3=seq(1:10), ID=3)
df_lst <- tibble::lst(df1, df2, df3)
codebook= data.frame(var.name = c("var1", "var2", "var3", "ID"),
                     df.name=c("df1", "df2", "df3", "ALL"))

try <- map(df_lst, ~ {
  # YYY is a placeholder for the name of the current data frame
  data_sub <- codebook %>% filter(var.name %in% names(.x) & df.name %in% c(YYY, "ALL"))
  .x %>% select(data_sub$var.name)
})

Solution

  • purrr::imap() will pass the name of each element along with the element itself:

    library(purrr)
    library(dplyr)
    
    imap(df_lst, \(df, nm) {
      data_sub <- codebook %>% 
        filter(var.name %in% names(df) & df.name %in% c(nm, "ALL"))
      df %>% select(data_sub$var.name) 
    })
    

    Note:

    1. If you prefer formula-style lambdas (i.e., using ~ instead of \()), the element is passed as .x and its name as .y.
    2. The "manual" approach would be to iterate over names(df_lst) instead of df_lst, then access the data frame within each iteration using df_lst[[.x]].

    Result:

    $df1
       var1 ID
    1     1  1
    2     2  1
    3     3  1
    4     4  1
    5     5  1
    6     6  1
    7     7  1
    8     8  1
    9     9  1
    10   10  1
    
    $df2
       var2 ID
    1     1  2
    2     2  2
    3     3  2
    4     4  2
    5     5  2
    6     6  2
    7     7  2
    8     8  2
    9     9  2
    10   10  2
    
    $df3
       var3 ID
    1     1  3
    2     2  3
    3     3  3
    4     4  3
    5     5  3
    6     6  3
    7     7  3
    8     8  3
    9     9  3
    10   10  3
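
The two alternatives from the notes can be sketched roughly as follows, assuming the same df_lst and codebook as in the question. The helper name keep_codebook_vars is hypothetical and only there to avoid repeating the filter logic; all_of() is used here as a stylistic choice for selecting columns from a character vector, not something the original code requires.

```r
library(purrr)
library(dplyr)

# Same data as in the question
df1 <- data.frame(var1 = 1:10, var2 = 1:10, ID = 1)
df2 <- data.frame(var2 = 1:10, ID = 2)
df3 <- data.frame(var3 = 1:10, ID = 3)
df_lst <- tibble::lst(df1, df2, df3)
codebook <- data.frame(var.name = c("var1", "var2", "var3", "ID"),
                       df.name  = c("df1", "df2", "df3", "ALL"))

# Hypothetical helper: keep only the columns that the codebook assigns
# to the data frame named `nm` (or to "ALL")
keep_codebook_vars <- function(df, nm) {
  data_sub <- codebook %>%
    filter(var.name %in% names(df) & df.name %in% c(nm, "ALL"))
  df %>% select(all_of(data_sub$var.name))
}

# 1. Formula-style lambda: .x is the data frame, .y is its name
res_formula <- imap(df_lst, ~ keep_codebook_vars(.x, .y))

# 2. "Manual" approach: iterate over the names and look up each data frame.
#    set_names() names the character vector with itself so the output
#    list keeps the df1/df2/df3 names.
res_manual <- names(df_lst) %>%
  set_names() %>%
  map(~ keep_codebook_vars(df_lst[[.x]], .x))
```

Both res_formula and res_manual should contain the same columns per data frame as the imap() result shown above.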