I have a data frame where some of the column names are non-syntactic. I want to select a range of these columns, but I am trying to do so inside a function. That is, the non-syntactic names which I want to select on are provided as variables to the function.
Here is an example:
library(dplyr)
df <- tibble(a = "a", b = "b", `1` = "c", `2` = "d")
f <- function(df) {
  select(df, `1`:`2`)
}
g <- function(df, var) {
  select(df, all_of(var):`2`)
}
The output is:
> f(df)
# A tibble: 1 x 2
`1` `2`
<chr> <chr>
1 c d
> g(df, var = 1)
# A tibble: 1 x 4
a b `1` `2`
<chr> <chr> <chr> <chr>
1 a b c d
> g(df, var = `1`)
Error in `select()`:
i In argument: `all_of(var)`.
Caused by error:
! object '1' not found
Run `rlang::last_trace()` to see where the error occurred.
I am trying to implement the functionality of g, but I want the output of f (in this specific example).
It seems that by all_of(var) I am referencing the column index; I would rather make a reference to the non-syntactic name of the third column of the data frame. How can I reference a non-syntactic name in a select-call when that name is stored as a variable?
The issue is not related to the use of a non-syntactic column name, i.e. you will get the same error with a syntactic name:
library(dplyr, warn.conflicts = FALSE)
df <- tibble(a = "a", b = "b", `1` = "c", `2` = "d")
g <- function(df, var) {
  select(df, all_of(var):`2`)
}
g(df, var = b)
#> Error in `select()`:
#> ℹ In argument: `all_of(var)`.
#> Caused by error:
#> ! object 'b' not found
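The issue is simply that all_of() expects a character or numeric vector, so it evaluates var as an ordinary object and fails when no such object exists. When you want to pass an unquoted column name to a function, you have to take care of that using a quote-and-unquote pattern, which can be done in one step using curly-curly, aka {{ (see here and here), i.e. you can do: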
g <- function(df, var) {
  select(df, {{ var }}:`2`)
}
g(df, var = `1`)
#> # A tibble: 1 × 2
#> `1` `2`
#> <chr> <chr>
#> 1 c d
The only special thing about non-syntactic column names is that we have to wrap them inside backticks.
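If the column name is instead stored as a character string, which may be what "stored as a variable" means in the question, the original all_of() approach should work unchanged, because all_of() accepts character vectors. Here is a minimal sketch; the function name h is hypothetical and only used for illustration:

```r
library(dplyr, warn.conflicts = FALSE)

df <- tibble(a = "a", b = "b", `1` = "c", `2` = "d")

# Hypothetical helper: `var` is expected to be a character string
# holding the column name, not a bare (unquoted) name.
h <- function(df, var) {
  select(df, all_of(var):`2`)
}

# No backticks are needed here, since "1" is an ordinary string.
h(df, var = "1")
```

With a string, all_of("1") is matched against the column names, whereas the number 1 is treated as a column position, which is why g(df, var = 1) returned all four columns in the question.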