Let's take this hypothetical code for instance:
```{r}
dataset_custom <- function(top, dataset, variable) {
  {{dataset}} %>%
    count({{variable}}) %>%
    top_n(top, n) %>%
    arrange(-n) %>%
    left_join({{dataset}}, by = "{{variable}}")
}
```
I know this will return an error when I try to run (say) `dataset_custom(5, dataset, variable)` because of the `by = "{{variable}}"` in `left_join`. How do I get around this issue?
I know that when you left join by a particular variable, you write `by = "variable"`, with `variable` in quotation marks. But how do I do that inside a function, where the quoted name should change depending on the input to the function I'm trying to create?
Thank you!
It is useful if you provide some toy data, like the one found in the example of `?left_join`. Note that `left_join(df1, df1)` is just `df1`. Instead, we can use a 2nd data argument.
```{r}
library(dplyr)

df1 <- tibble(x = 1:3, y = c("a", "a", "b"))
df2 <- tibble(x = c(1, 1, 2), z = c("first", "second", "third"))
df1 %>% left_join(df2, by = "x")

f <- function(data, data2, variable) {
  # capture the unevaluated argument and turn it into a string, e.g. "x"
  var <- deparse(substitute(variable))
  data %>%
    count({{ variable }}) %>%
    arrange(-n) %>%
    left_join(data2, by = var)
}

f(df1, df2, x)
#>       x     n z
#>   <dbl> <int> <chr>
#> 1     1     1 first
#> 2     1     1 second
#> 3     2     1 third
#> 4     3     1 NA

# and
f(df2, df1, x)
#>       x     n y
#>   <dbl> <int> <chr>
#> 1     1     2 a
#> 2     2     1 a
```
For this to work, we need a defusing operation so that the input is captured rather than evaluated: `deparse(substitute(variable))` turns the unevaluated argument into the string that `by` expects. Figuratively speaking, using `{{ }}` as the `by` argument is like using a hammer instead of sandpaper for polishing things: it is a forcing operation where none should happen.
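If you would rather stay within the tidy-eval toolkit, the same defusing step can be sketched with rlang instead of base R. This is only an alternative under assumptions: the function name `f2` is made up for illustration, and `rlang::ensym()` plus `rlang::as_name()` are swapped in for `deparse(substitute())`; the toy data is the same as above.

```{r}
library(dplyr)

df1 <- tibble(x = 1:3, y = c("a", "a", "b"))
df2 <- tibble(x = c(1, 1, 2), z = c("first", "second", "third"))

# f2 is a hypothetical name; it mirrors f() but defuses with rlang
f2 <- function(data, data2, variable) {
  var_sym <- rlang::ensym(variable)   # defuse the argument into a symbol
  var_chr <- rlang::as_name(var_sym)  # the same name as a string, e.g. "x"
  data %>%
    count(!!var_sym) %>%              # inject the symbol where a column is expected
    arrange(desc(n)) %>%
    left_join(data2, by = var_chr)    # `by` gets a plain character string
}

res <- f2(df1, df2, x)
res
```

The design point is the same as before: the column name is captured once, then handed to `count()` as a symbol and to `left_join()` as a string, so `{{ }}` never has to appear inside quotation marks.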