I have a variable with the same name as a column in a dataframe:
df <- data.frame(a=c(1,2,3), b=c(4,5,6))
b <- 5
I want to get the rows where df$b == b, but dplyr interprets this as df$b == df$b:
df %>% filter(b == b) # interpreted as df$b == df$b
#   a b
# 1 1 4
# 2 2 5
# 3 3 6
If I change the variable name, it works:
B <- 5
df %>% filter(b == B) # interpreted as df$b == B
#   a b
# 1 2 5
I'm wondering if there is a better way to tell filter that b refers to an outside variable.
Recently I have found this to be an elegant solution to this problem, although I'm just starting to wrap my head around how it works.
df %>% filter(b == !!b)
which was originally introduced as syntactic sugar for
df %>% filter(b == UQ(b)) # UQ() has since been deprecated in rlang in favor of !!
At a high level, the unquote operation (!!, or UQ) causes its contents to be evaluated before filter runs, so that b is looked up in the calling environment rather than within the data.frame.
This is described in the 'Quasiquotation' chapter of Advanced R, which also includes a few solutions to similar problems related to non-standard evaluation (NSE).
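To make the evaluation order concrete, here is a minimal sketch using the df and b from the question. It assumes dplyr and rlang are installed. The names res_bang and res_env are only illustrative, and the .data/.env pronoun form is an alternative I'm adding here (I believe it needs dplyr 1.0.0 or later), not part of the original solution.

```r
library(dplyr)

df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
b <- 5

# Capture the expression without evaluating it: !!b is replaced by the
# value of the outside variable b, while the left-hand b stays a symbol.
rlang::expr(b == !!b)
# b == 5

# Inside filter(), the left-hand b is the column and !!b is the value 5.
res_bang <- df %>% filter(b == !!b)

# Alternative (assumption: dplyr >= 1.0.0): name the source of each b
# explicitly with the .data and .env pronouns.
res_env <- df %>% filter(.data$b == .env$b)
```

Both res_bang and res_env should contain only the row where a is 2 and b is 5. The pronoun form can be easier to read when several names collide, since each side says where it comes from.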