I've been searching around this morning to figure out whether the failure below is expected, but I haven't found anything. Could anyone point me to a related discussion? Otherwise, I might submit it as an issue. Appreciate it.
library(data.table)
x <- data.table( a = 1:3 )
y <- data.table( a = 2:4 )
z <- data.table( a = 3:5 )
# works
merge( x , y )
# works
merge( y , z )
# fails
merge( x , merge( y , z ) )
# Error in merge.data.table(x, merge(y, z)) :
# A non-empty vector of column names for `by` is required.
# works
merge( merge( x , y ) , z )
This is a clear bug. Please report it. Luckily, it should be easy to fix.
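In the meantime, one workaround that should avoid the problem is to name the join column yourself, so that merge does not have to infer it. This is only a sketch using the tables from the question; it assumes the shared column is a, and it relies only on the documented by argument of merge:

```r
library(data.table)

x <- data.table(a = 1:3)
y <- data.table(a = 2:4)
z <- data.table(a = 3:5)

# Pass the join column explicitly instead of letting merge infer it
# from the keys or the common column names.
res <- merge(x, merge(y, z), by = "a")
res
```

With by given explicitly, the order of the nested calls should no longer matter.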
merge.data.table contains this code:
if (is.null(by))
by = intersect(key(x), key(y))
if (is.null(by))
by = key(x)
if (is.null(by))
by = intersect(names(x), names(y))
Now, the issue is that y is keyed (because merge.data.table sets a key):
x <- data.table( a = 1:3 )
y <- merge(data.table( a = 2:4 ), data.table( a = 3:5 ))
haskey(y)
#[1] TRUE
Then,
intersect(key(x), key(y))
#character(0)
Thus, by is character(0) rather than NULL, so is.null(by) is FALSE and neither of the following if conditions is TRUE (we would want the third one to apply here).
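The failure doesn't occur in your last case, where the first argument is keyed and the second is not, because intersect handles a NULL argument differently depending on its position: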
intersect("foo", NULL)
#NULL
intersect(NULL, "foo")
#character(0)
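As a rough illustration of what a fix could look like, the checks can test for an empty result instead of NULL, since length() is 0 for both NULL and character(0). The helper resolve_by below is hypothetical and not part of data.table; it only mirrors the three steps quoted above and is not necessarily how the package would fix it:

```r
library(data.table)

# Hypothetical helper (not a data.table function): choose the join columns,
# treating both NULL and character(0) as "nothing found yet".
resolve_by <- function(x, y, by = NULL) {
  if (!length(by)) by <- intersect(key(x), key(y))      # keys shared by both
  if (!length(by)) by <- key(x)                         # else the key of x
  if (!length(by)) by <- intersect(names(x), names(y))  # else common columns
  by
}

x <- data.table(a = 1:3)                              # no key
y <- merge(data.table(a = 2:4), data.table(a = 3:5))  # result of a merge

by_cols <- resolve_by(x, y)
by_cols
```

Here the first two steps come up empty, so the third one applies and the common column a is used.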