
Access specific elements in double curly brackets in dplyr


I have a function that takes as input the data frame columns I want to summarize. I would like to arrange the rows in descending order of the first column I specify. Can I do this with one argument instead of two?

This is an example of what I would like to do:

foo <- function(sum_col) {
  sales_df %>%
    group_by(styleFam, prepack) %>%
    summarize(across({{sum_col}}, sum)) %>%
    arrange(desc(`by first element in sum_col`))
}

foo(sum_col = c(soldUnits, extPrice))

I would rather not add a second argument for the column passed to desc, as in:

foo <- function(sum_col, desc_col) {
  sales_df %>%
    group_by(styleFam, prepack) %>%
    summarize(across({{sum_col}}, sum)) %>%
    arrange(desc({{desc_col}}))
}

foo(sum_col = c(soldUnits, extPrice), desc_col = soldUnits)

Solution

    1) Using the built-in CO2 data set, capture sum_col as a quosure with enquo and store it in cols. get_expr(cols) then returns the language object c(uptake, conc). Its first element is the function name c and its second is uptake, so [[2]] extracts the first column as col1. Finally, inject sum_col and col1 into the pipeline using curly-curly ({{ }}). Note that this relies on the argument being written as a c(...) call.

    library(dplyr)
    library(rlang)
    
    foo <- function(sum_col) {
      cols <- enquo(sum_col)
      col1 <- get_expr(cols)[[2]]
      CO2 %>%
        group_by(Treatment, Type) %>%
        summarize(across({{sum_col}}, sum)) %>%
        arrange(desc({{col1}}))
    }
    foo(c(uptake, conc))
    

    giving

    # A tibble: 4 × 4
    # Groups:   Treatment [2]
      Treatment  Type        uptake  conc
      <fct>      <fct>        <dbl> <dbl>
    1 nonchilled Quebec        742   9135
    2 chilled    Quebec        667.  9135
    3 nonchilled Mississippi   545   9135
    4 chilled    Mississippi   332.  9135
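
    The [[2]] indexing in 1) assumes the argument is a c(...) call, so a single bare column such as foo(uptake) would fail because a symbol cannot be subset. A minimal sketch of one way to handle both forms follows. The names first_col and foo1b are hypothetical (not from the answer above), and .groups = "drop" is added only to silence the grouping message. It still does not handle tidyselect helpers such as starts_with().

```r
library(dplyr)
library(rlang)

# Hypothetical helper: return the first column named in a captured
# expression, whether it is `uptake` or `c(uptake, conc)`.
first_col <- function(quo) {
  expr <- quo_get_expr(quo)
  # For a call to c(), element 1 is `c` and element 2 is the first argument.
  if (is_call(expr, "c")) expr[[2]] else expr
}

foo1b <- function(sum_col) {
  col1 <- first_col(enquo(sum_col))
  CO2 %>%
    group_by(Treatment, Type) %>%
    summarize(across({{ sum_col }}, sum), .groups = "drop") %>%
    arrange(desc(!!col1))  # col1 is a symbol, so inject it with !!
}

foo1b(uptake)            # single bare column
foo1b(c(uptake, conc))   # same call form as in 1)
```

    Here !!col1 is used instead of {{col1}} because col1 is a local variable holding a symbol rather than a function argument.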
    

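    2) character vector. If we redesign the function to accept a character vector instead, the code simplifies. any_of selects the named columns (silently skipping any that do not exist), and .data[[first(sum_col)]] refers to the first one by name.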

    foo2 <- function(sum_col) {
      CO2 %>%
        group_by(Treatment, Type) %>%
        summarize(across(any_of(sum_col), sum)) %>%
        arrange(desc(.data[[first(sum_col)]]))
    }
    foo2(c("uptake", "conc"))
    

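    3) formula. Another redesign is to accept a one-sided formula and use all.vars to extract its variable names as a character vector, which is then passed to foo2.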

    foo3 <- function(sum_col) foo2(all.vars(sum_col))
    foo3(~ uptake + conc)
    

    4) generic. Alternatively, define an S3 generic with methods so that a single function accepts either a character vector or a formula.

    foo4 <- function(sum_col) UseMethod("foo4")
    foo4.character <- foo2
    foo4.formula <- function(sum_col) foo4(all.vars(sum_col))
    
    foo4(c("uptake", "conc"))
    foo4(~ uptake + conc)
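
    If keeping unquoted columns matters, another possible variant (a sketch, not part of the approaches above) is to let tidyselect resolve the selection and then sort by the first resolved name. It assumes tidyselect::eval_select, which returns column positions named by column in the order selected. The name foo5 is hypothetical.

```r
library(dplyr)

foo5 <- function(sum_col) {
  # Resolve the selection against CO2: a named integer vector of positions,
  # in the order the columns were specified.
  sel <- tidyselect::eval_select(rlang::enquo(sum_col), CO2)
  first_name <- names(sel)[1]
  CO2 %>%
    group_by(Treatment, Type) %>%
    summarize(across({{ sum_col }}, sum), .groups = "drop") %>%
    arrange(desc(.data[[first_name]]))
}

foo5(c(uptake, conc))
foo5(uptake)
```

    Because the selection is evaluated rather than inspected as an expression, this form should also accept a single bare column or a tidyselect helper, at the cost of evaluating the selection twice.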