r dplyr purrr nse

How to use strings and lists of strings in nested functions in R


I am still trying to improve my understanding of how to create a grid table and reference columns, either in the grid table itself or in a nested dataset, but I keep running into issues with NSE (non-standard evaluation).

I've read the article below and tried several things, but I still couldn't get it to work.

https://dplyr.tidyverse.org/articles/programming.html

Ultimately, I am trying to figure out how to use columns from a grid table to identify columns inside a nested table.


library(tidyverse)

#create tables with columns
sort_tbl <- tibble(sort_col="table")
from_col <- list("x")
delta_col <- list("y","z")


#nest the dataset by cut, then add the argument columns
nested_df <- diamonds %>% 
  group_by(cut) %>% 
  nest() %>% 
  mutate(sort_col="table",
         from_col=list("x"),
         delta_col=list(delta_col),
         arg_col=500
         ) 

#create a function that takes column names from the grid table and tries to apply them to the dataset
fun <- function(data,from_col,delta_col,sort_col,arg_col) {
  
  data %>% 
    mutate(across(!!ensym(delta_col),
                  ~!!ensym(from_col)-.x)) %>% 
    arrange(desc({sort_col})) %>% 
    mutate(test_col=if_else(price>{arg_col},1,0))
  
}

#try to apply it to each nested dataset, but this doesn't work

nested_df %>% 
  mutate(model=pmap(list(data,
                         from_col,
                         delta_col),
                         ~fun(data=data,
                              from_col=from_col,
                              delta_col=delta_col,
                              arg_col=arg_col)
                    )
         )



Solution

  • There are four issues with your code. First, in pmap() you have to loop over all five columns that your function takes as arguments. Second, by calling fun(data = data, ...) inside the lambda you are passing each whole column to your function instead of a single element; passing fun directly lets pmap() supply the elements one row at a time. Third, delta_col is a list, so you have to unlist() it and wrap it in all_of() inside across(). Finally, since sort_col is a character string, use the .data pronoun in arrange():

    library(tidyverse)
    
    fun <- function(data, from_col, delta_col, sort_col, arg_col) {
      data %>%
        mutate(across(
          all_of(unlist(delta_col)),
          ~ !!ensym(from_col) - .x
        )) %>%
        arrange(desc(.data[[sort_col]])) %>%
        mutate(test_col = if_else(price > arg_col, 1, 0))
    }
    
    nested_df %>%
      mutate(model = pmap(
        list(
          data,
          from_col,
          delta_col,
          sort_col,
          arg_col
        ),
        fun
      ))
    #> # A tibble: 5 × 7
    #> # Groups:   cut [5]
    #>   cut       data                  sort_col from_col  delta_col  arg_col model   
    #>   <ord>     <list>                <chr>    <list>    <list>       <dbl> <list>  
    #> 1 Ideal     <tibble [21,551 × 9]> table    <chr [1]> <list [2]>     500 <tibble>
    #> 2 Premium   <tibble [13,791 × 9]> table    <chr [1]> <list [2]>     500 <tibble>
    #> 3 Good      <tibble [4,906 × 9]>  table    <chr [1]> <list [2]>     500 <tibble>
    #> 4 Very Good <tibble [12,082 × 9]> table    <chr [1]> <list [2]>     500 <tibble>
    #> 5 Fair      <tibble [1,610 × 9]>  table    <chr [1]> <list [2]>     500 <tibble>
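
As a small, checkable variation on the answer above, the sketch below runs the same pattern on a tiny made-up tibble so the results can be verified by eye. The names `toy`, `nested_toy` and `fun_str` are hypothetical and not part of the original question. It also swaps `!!ensym(from_col)` for `.data[[from_col]]`, which is an assumption on my part about what reads more clearly when every column name arrives as a plain string; the answer's version is the one shown to work on `diamonds`.

```r
library(tidyverse)

# Hypothetical toy data standing in for `diamonds`
toy <- tibble(
  grp   = c("a", "a", "b", "b"),
  x     = c(10, 20, 30, 40),
  y     = c(1, 2, 3, 4),
  z     = c(5, 6, 7, 8),
  table = c(1, 2, 4, 3),
  price = c(400, 600, 700, 300)
)

# Same steps as `fun`, but all column names are looked up as strings
fun_str <- function(data, from_col, delta_col, sort_col, arg_col) {
  data %>%
    # replace each delta column with (from column - delta column)
    mutate(across(all_of(unlist(delta_col)), ~ .data[[from_col]] - .x)) %>%
    # sort by the column whose name is stored in sort_col
    arrange(desc(.data[[sort_col]])) %>%
    # flag rows whose price exceeds the threshold
    mutate(test_col = if_else(price > arg_col, 1, 0))
}

# One row per group, with the argument columns alongside the nested data
nested_toy <- toy %>%
  nest(data = -grp) %>%
  mutate(
    from_col  = list("x"),
    delta_col = list(list("y", "z")),
    sort_col  = "table",
    arg_col   = 500
  )

# pmap() hands one element of each column to fun_str, in argument order
result <- nested_toy %>%
  mutate(model = pmap(
    list(data, from_col, delta_col, sort_col, arg_col),
    fun_str
  ))

result$model[[1]]$y         # 18 9 (x - y, sorted by table descending)
result$model[[1]]$test_col  # 1 0
```

Because the list passed to `pmap()` is unnamed, its elements are matched to the arguments of `fun_str` by position, so the order of the columns in `list(...)` has to follow the function signature.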