I would like to define the "subset" (bracket) [ method for an S4 class, let's call it foo, via setMethod("[", 'foo', ...), so that calling it applies the [ operator to the data.table the object holds in a specific slot.
Example:
library(data.table)
foo <- setClass("foo", slots = c(myDT = "data.table"),
                prototype = prototype(myDT = data.table()))  # prototype must match the slot class
# quickly make a foo object with a DT in the myDT slot
myfoo <- new("foo", myDT = data.table(x=rep(c("b","a","c"),each=3), y=c(1,3,6), v=1:9))
# sneak peek
myfoo
An object of class "foo"
Slot "myDT":
x y v
1: b 1 1
2: b 3 2
3: b 6 3
4: a 1 4
5: a 3 5
6: a 6 6
7: c 1 7
8: c 3 8
9: c 6 9
The tricky part
# I want to be able to do eg
myfoo[1:3, 2:3]
y v
1: 1 1
2: 3 2
3: 6 3
and have it give me the same result as if doing:
myfoo@myDT[1:3, 2:3]
y v
1: 1 1
2: 3 2
3: 6 3
So far (I am guessing) it will/should be something along the lines of
setMethod(f = "[", signature = signature(x = "foo"),
  definition = function(x, ...) {
    `[`(x@myDT, ...)
    # OR maybe
    # x <- x@myDT
    # callNextMethod(x, ...)
  }
)
But however I call myfoo[i,j], it always just returns the whole data.table.
Any ideas on whether this can be accomplished? So far I usually get stuck on errors about j not fitting the bill.
I would also like to avoid having to fully implement some form of shadow-indexing for this slot if I can somehow "recycle" what is already available in data.table, with the added benefit that the other data.table functions might also become applicable this way. But for a start, "passing on" the indices would be enough.
PS: If you wonder why I don't just use myfoo@myDT: the real-life foo class has multiple slots, of which only one (the data.table one) is "worthy" of being indexed, so I want to "shortcut" that method's application a bit.
Here is a late, not-so-hacky answer:
library(data.table)
setClass("Foo", slots = c(dt = "data.table"), prototype = list(dt = data.table()))
setMethod("[", signature(x = "Foo", i = "ANY", j = "ANY", drop = "ANY"),
  function(x, i, j, ..., drop = TRUE) {
    if (missing(j))
      callGeneric(x@dt, i, , ..., with = TRUE)
    else
      callGeneric(x@dt, i, j, ..., with = FALSE)
  })
foo <- new("Foo", dt = data.table(x = letters[1:6], y = 1:6, z = rnorm(6L)))
identical(foo[1:3, 2:3], foo@dt[1:3, 2:3]) # TRUE
This method still does not support the main features of [.data.table for the reason outlined in this question, namely that i and j must be evaluated in addition to x before multiple dispatch (a feature of S4, not S3) can occur. Hence:
foo@dt[y >= 3L]
## x y z
## 1: c 3 0.02991911
## 2: d 4 -0.36919712
## 3: e 5 -0.03291414
## 4: f 6 -1.02399695
foo[y >= 3L]
## Error in `[.data.table`(x@dt, i, , with = TRUE, ...) :
## i is not found in calling scope and it is not a column name either. When the first argument inside DT[...] is a single symbol (e.g. DT[var]), data.table looks for var in calling scope.
You can still use variables in your environment as index vectors:
ii <- 3:6
foo[ii]
## x y z
## 1: c 3 0.02991911
## 2: d 4 -0.36919712
## 3: e 5 -0.03291414
## 4: f 6 -1.02399695
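If the full data.table syntax is needed on an S4 object, one low-tech option is to skip the [ method for those calls and expose the slot through an accessor. The sketch below assumes a hypothetical generic named tableOf (it is not part of data.table or of the code above). Because [ is then called directly on a data.table, i and j are captured unevaluated as usual.

```r
library(data.table)

setClass("Foo", slots = c(dt = "data.table"), prototype = list(dt = data.table()))

# Hypothetical accessor: returns the data.table stored in the dt slot.
setGeneric("tableOf", function(object) standardGeneric("tableOf"))
setMethod("tableOf", "Foo", function(object) object@dt)

foo <- new("Foo", dt = data.table(x = letters[1:6], y = 1:6))

# `[` is called on the data.table itself, so column names can be used in i and j.
tableOf(foo)[y >= 3L]
tableOf(foo)[y >= 3L, x]
```

This is only marginally shorter than foo@dt, but it keeps the slot name out of calling code.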
Anyway, I agree with the comments suggesting that it is often better to implement classes built around data.table in S3.
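A minimal sketch of what the S3 route could look like, assuming the class can inherit from data.table directly rather than wrap it. The constructor name new_foo and the class name "foo" are hypothetical. With no `[.foo` method defined, S3 dispatch falls through to `[.data.table`, so column names in i and j work.

```r
library(data.table)

# Hypothetical constructor: tags a copy of the table with an extra S3 class.
new_foo <- function(dt) {
  stopifnot(is.data.table(dt))
  dt <- copy(dt)                             # do not modify the caller's table by reference
  setattr(dt, "class", c("foo", class(dt)))  # "foo" now inherits from data.table
  dt
}

myfoo <- new_foo(data.table(x = rep(c("b", "a", "c"), each = 3),
                            y = c(1, 3, 6), v = 1:9))

# Dispatch on `[` finds no `[.foo` and uses `[.data.table`.
myfoo[1:3, 2:3]
myfoo[y >= 3, .(x, v)]
```

Any further fields that the S4 version kept in other slots would have to live elsewhere, for example as attributes set with setattr; that part is an assumption and is not shown here.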