r, dplyr, data.table

data.table two‐dot pronoun (..) in i or tidyverse bang bang !! equivalent in data.table


I'm trying to filter a data.table by comparing a column to an external R variable using the "two‐dot" pronoun (..), but I keep getting

Error in `[.data.table`(dt, reg == ..reg) : object '..reg' not found

even though:

I’m on data.table v1.14.8

is.data.table(dt) returns TRUE

I’ve loaded data.table after dplyr, so I believe I’m dispatching the right [ method.

Reproducible example:

library(data.table)
# Confirm version
packageVersion("data.table")   # ‘1.14.8’

# Sample data
reg <- 7
dt <- data.table(
  g   = 1:10,
  reg = rep(5:8, length.out=10)
)

# This works for j:
dt[, region := ..reg]

# But this fails for i:
dt[ reg == ..reg ]
#> Error in `[.data.table`(dt, reg == ..reg) : object '..reg' not found

What am I missing? Should .. work in the i position as well as j? Are there any namespace or search‐path issues I should check? Any pointers to the exact documentation/vignette that shows .. being used in i would be appreciated.

In general I want to write something equivalent to the following tidyverse script

dt |> filter(reg == !!reg)

I have read the official data.table documentation, but it is not clear whether the dot-dot prefix (..) can be used in i as well.


Solution

  • If you are locked into using data.table 1.14.8, I think a clear path is to use get() with an explicit environment, as in

    library(data.table)
    packageVersion("data.table")
    # [1] ‘1.14.8’
    reg <- 7
    dt <- data.table(g = 1:10, reg = rep(5:8, length.out=10))
    thisenv <- environment()
    dt[reg == get('reg', envir = thisenv)]
    #        g   reg
    #    <int> <int>
    # 1:     3     7
    # 2:     7     7
    

    The reason for the indirection is that hard-coding envir=.GlobalEnv or envir=globalenv() locks you into exactly one caller/callee flow, and any deviation from it (for example, running the same line inside a function) may well not work as intended. With thisenv the lookup is unambiguous: you want the object named 'reg' that is visible in thisenv, and no other.

    (If you are always using this interactively in the global environment, then you can cheat with dt[reg == get('reg', envir = .GlobalEnv)], though don't get too comfortable with it :-) )
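
    To illustrate why the explicit environment matters, here is a minimal sketch that runs the same filter inside a function. The helper name filter_by_reg is hypothetical (it is not part of data.table); it only assumes that the function argument and the column share the name reg.

    ```r
    library(data.table)

    # Hypothetical helper: `reg` is both a column of `dt` and a function argument.
    filter_by_reg <- function(dt, reg) {
      thisenv <- environment()                 # the function's own frame
      # Left-hand `reg` is the column; get() fetches the argument from thisenv.
      dt[reg == get("reg", envir = thisenv)]
    }

    dt <- data.table(g = 1:10, reg = rep(5:8, length.out = 10))
    res <- filter_by_reg(dt, 7)
    ```

    With envir = .GlobalEnv instead, get() would look for a global reg rather than the argument, which is the kind of deviation described above.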

    If you can update to at least 1.15.0, the Programming on data.table article demonstrates the new env= argument, which allows an approach such as

    library(data.table)
    packageVersion('data.table')
    # [1] ‘1.17.0’
    reg <- 7
    dt <- data.table(g = 1:10, reg = rep(5:8, length.out=10))
    dt[reg == r, env = list(r = reg)]
    #        g   reg
    #    <int> <int>
    # 1:     3     7
    # 2:     7     7
    

    where you can name it r or whatever you want (except for names that are already columns of dt, i.e. names found in .SD :-).
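
    The same env= idea can be wrapped in a function so that both the column and the value are parameters. This is only a sketch, assuming data.table >= 1.15.0; filter_eq is a hypothetical helper name. As I understand the Programming on data.table article, a character string in env is substituted as a column/variable name, while other values are substituted as they are.

    ```r
    library(data.table)  # assumes data.table >= 1.15.0 (env= argument)

    # Hypothetical helper: keep the rows where column `col` equals `value`.
    filter_eq <- function(dt, col, value) {
      # `col` is passed as a string and substituted as a column name;
      # `value` is numeric here and substituted as-is.
      # (A character value would need I() to stay a string instead of a name.)
      dt[col == value, env = list(col = col, value = value)]
    }

    dt <- data.table(g = 1:10, reg = rep(5:8, length.out = 10))
    res <- filter_eq(dt, "reg", 7)
    ```

    Inside the function the placeholders col and value cannot collide with the columns g and reg, so the naming caveat above is easy to satisfy.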