I have a function that takes the data frame columns I want to summarize as its input. I would like to arrange the rows, in descending order, by the first column I specify. Can I do this with one argument instead of two?
This is an example of what I would like to do:
foo <- function(sum_col) {
  sales_df %>%
    group_by(styleFam, prepack) %>%
    summarize(across({{sum_col}}, sum)) %>%
    arrange(desc(`by first element in sum_col`))
}
foo(sum_col = c(soldUnits, extPrice))
I don't want to create a second argument for desc:
foo <- function(sum_col, desc_col) {
  sales_df %>%
    group_by(styleFam, prepack) %>%
    summarize(across({{sum_col}}, sum)) %>%
    arrange(desc({{desc_col}}))
}
foo(sum_col = c(soldUnits, extPrice), desc_col = soldUnits)
1) With the built-in CO2 data set, use enquo to get the quosure cols. Then use get_expr to get the language object c(uptake, conc) from it and pick off the second element of that language object (the first element is c and the second is uptake), giving col1. Finally, inject sum_col and col1 into the pipeline using curly-curly.
library(dplyr)
library(rlang)
foo <- function(sum_col) {
  cols <- enquo(sum_col)        # quosure wrapping the captured expression
  col1 <- get_expr(cols)[[2]]   # second element of the c(...) call: the first column
  CO2 %>%
    group_by(Treatment, Type) %>%
    summarize(across({{sum_col}}, sum)) %>%
    arrange(desc({{col1}}))
}
foo(c(uptake, conc))
giving
# A tibble: 4 × 4
# Groups:   Treatment [2]
  Treatment  Type        uptake  conc
  <fct>      <fct>        <dbl> <dbl>
1 nonchilled Quebec        742   9135
2 chilled    Quebec        667.  9135
3 nonchilled Mississippi   545   9135
4 chilled    Mississippi   332.  9135
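One caveat with picking off element 2: it assumes the argument is written as a c(...) call. If a single bare column were passed, the captured expression would be a symbol rather than a call, and [[2]] would most likely fail. A minimal sketch of a guard follows; first_col_name is a hypothetical helper introduced here for illustration and is not part of the answer above.

```r
library(rlang)

# Hypothetical helper (an assumption, not from the original answer):
# returns the name of the first column in an unquoted argument.
first_col_name <- function(sum_col) {
  expr <- get_expr(enquo(sum_col))
  if (is_call(expr, "c")) {
    # c(uptake, conc): element 1 is `c`, element 2 is the first column
    as.character(expr[[2]])
  } else {
    # a single bare column such as `conc`
    as.character(expr)
  }
}

first_col_name(c(uptake, conc))
first_col_name(conc)
```

The same check could be used inside foo to choose between the second element of the call and the expression itself before arranging.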
2) Character vector. If we redesign the function to accept a character vector instead, the code simplifies.
foo2 <- function(sum_col) {
  CO2 %>%
    group_by(Treatment, Type) %>%
    summarize(across(any_of(sum_col), sum)) %>%
    arrange(desc(.data[[first(sum_col)]]))
}
foo2(c("uptake", "conc"))
3) Formula. Another redesign is to accept a formula and pass its variable names on to foo2:
foo3 <- function(sum_col) foo2(all.vars(sum_col))
foo3(~ uptake + conc)
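To see why the formula version works, it may help to look at what all.vars returns on its own. This is a small base R sketch; the variable name vars is just for illustration.

```r
# all.vars() extracts the variable names from a formula, in order of appearance
vars <- all.vars(~ uptake + conc)
print(vars)

# The first name is the one foo2 would arrange by
print(vars[1])
```

Note that with a two-sided formula such as y ~ uptake + conc, the left-hand side variable would come first, so a one-sided formula is the safer input here.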
4) Generic. Alternatively, use an S3 generic with methods so that the function accepts either a character vector or a formula.
foo4 <- function(sum_col) UseMethod("foo4")
foo4.character <- foo2
foo4.formula <- function(sum_col) foo4(all.vars(sum_col))
foo4(c("uptake", "conc"))
foo4(~ uptake + conc)