
Forward all arguments passed to the `[` subset method to another object in R


In R, I have an S4 class that "wraps" around a vector. I want to forward many of the normal S3 and S4 methods on this class to its vec slot, which contains the vector. However, I'm having particular trouble with the [ method because of its special properties. Edit: I want to forward all arguments to [, including i, j, drop, etc., to this inner vector, exactly as they are passed to the outer wrapper object.

Here's a simple example of my code and what I've tried. Firstly, we start by defining this simple class and instantiating it:

> Foo = setClass("Foo", slots=c(vec = "numeric"))
> foo = Foo(vec=c(1, 2, 3))
> foo
An object of class "Foo"
Slot "vec":
[1] 1 2 3

All good so far. Next, we try to forward all arguments to [ using the normal ... dots:

> setMethod("[", signature=c(x="Foo"), function(x, ...){
+   print(list(...))
+   x@vec[...]
+ })
> foo[2]
list()
[1] 1 2 3

Clearly this doesn't work because it's not subsetting the vector at all. But not only that, the print statement shows that ... are not capturing anything, even though i=2 is being passed here. So I give up on using ....

Next I try explicitly using i and j:

> setMethod("[", signature=c(x="Foo"), function(x, i, j, ..., drop = TRUE){
+   x@vec[i, j, drop=drop]
+ })
> foo[2]
Error in x@vec[i, j, drop = drop] : incorrect number of dimensions

This doesn't work because I'm always passing in j, but R doesn't have a mechanism to selectively pass in arguments, so I'm a bit stuck here.

Next I try match.call:

> setMethod("[", signature=c(x="Foo"), function(x, ...){
+   call = as.list(match.call())
+   fun = call[[1]]
+   str(fun)
+   args = call[-1]
+   str(args)
+   do.call(fun, args)
+ })
> foo[2]
 symbol [
List of 2
 $ x: symbol foo
 $ i: num 2
Error in do.call(fun, args) : 
  'what' must be a function or character string

This is interesting, because it proves that match.call does capture extra arguments where ... did not. But because x and [ are captured as symbols and not as their real values, it doesn't work this easily, and it's not clear how to resolve them.

How then can I forward all arguments passed to the [ method to another object?


Solution

  • Many of these issues can be understood with a careful reading of ?setMethod. I'll address your suggestions one by one, then add some of my own.

    1)

    setMethod("[", signature = c(x = "Foo"), 
              function(x, ...) { 
                  print(list(...))
                  x@vec[...]
              })
    

    The formal arguments of the method do not match those of the generic function. setMethod corrects them automatically, without warning (sigh). Hence your actual method looks like this:

    getMethod("[", signature = c(x = "Foo"))
    
    Method Definition:
    
    function (x, i, j, ..., drop = TRUE) 
    {
        print(list(...))
        x@vec[...]
    }
    
    Signatures:
            x    
    target  "Foo"
    defined "Foo"
    

    When we call foo[2], we find that x matches foo, i matches 2, and ... matches nothing. Hence list(...) is empty and x@vec[...] evaluates to x@vec.
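
    One way to check this yourself is to compare the formal argument names of the generic with those of the stored method. The following is a minimal sketch, assuming a fresh session; it uses formalArgs from the methods package, and the variable names generic_args and method_args are just for illustration:

    ```r
    library(methods)

    Foo <- setClass("Foo", slots = c(vec = "numeric"))

    # Method written with only (x, ...), as in the question
    setMethod("[", signature = c(x = "Foo"),
              function(x, ...) x@vec[...])

    # Formal argument names of the generic and of the method as stored
    generic_args <- formalArgs(getGeneric("["))
    method_args  <- formalArgs(getMethod("[", signature = c(x = "Foo")))

    print(generic_args)
    print(method_args)
    ```

    If the two vectors agree, the method you wrote with (x, ...) was silently given the generic's formals, which is why i never ends up in the dots.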

    2)

    setMethod("[", signature = c(x = "Foo"), 
              function(x, i, j, ..., drop = TRUE) {
                  x@vec[i, j, drop = drop]
              })
    

    The formal arguments of this method are fine, but you are indexing a vector as an array, which is an error. You could ignore everything except i and simply return x@vec[i], but there are still traps there (e.g., i could be missing).

    3)

    setMethod("[", signature = c(x = "Foo"), 
              function(x, ...) {
                  call <- as.list(match.call())
                  fun <- call[[1]]
                  str(fun)
                  args <- call[-1]
                  str(args)
                  do.call(fun, args)
              })
    

    This one is wrong because (again) the formal arguments don't match those of the generic function, and for the reason you point out: fun is a symbol and args is a list of language objects (and maybe constants), which is not what do.call expects. In any case, my opinion is that approaches based on match.call are too complicated and fragile.
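
    To make the symbol-versus-value point concrete, here is a small base R sketch. The function f and the vector v are made up for illustration; the idea is only that match.call hands back unevaluated pieces, whereas do.call wants a function (or its name as a string) together with a list of arguments:

    ```r
    # Hypothetical helper: reports what match.call() captured
    f <- function(x, i) {
        cl <- match.call()
        list(fun_is_symbol = is.symbol(cl[[1L]]),   # the function position
             x_is_symbol   = is.symbol(cl[["x"]]))  # the first argument
    }

    v <- c(1, 2, 3)
    captured <- f(v, 2)
    str(captured)

    # do.call is happy when given the function itself and evaluated arguments
    picked <- do.call(`[`, list(v, 2))
    ```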

    How I would do it

    Your definition of class Foo implies that foo@vec does not have a dim attribute. It is always a numeric vector and never an array.

    Foo <- setClass("Foo", slots = c(vec = "numeric"))
    foo.v <- Foo(vec = c(1, 2, 3))
    foo.a <- Foo(vec = toeplitz(1:6))
    ## Error in validObject(.Object) : 
    ##   invalid class "Foo" object: invalid object for slot "vec" in class "Foo": got class "matrix", should be or extend class "numeric"
    

    For this vector-like class Foo, I would define:

    setMethod("[", signature = c(x = "Foo", i = "ANY", j = "missing", drop = "missing"), 
              function(x, i, j, ..., drop = TRUE) {
                  if (nargs() > 2L)
                      stop("'x' of class Foo must be indexed as x[i]")
                  else if (missing(i)) 
                      x@vec 
                  else x@vec[i]
              })
    

    This allows x[] or x[i] while prohibiting all of x[drop=], x[i, ], x[i, , drop=], x[i, j], and x[i, j, drop=], which do not make sense for vector-like classes.

    foo.v[]
    ## [1] 1 2 3
    
    foo.v[-2L]
    ## [1] 1 3
    
    foo.v[1L, 1L, drop = FALSE]
    ## Error in foo.v[1L, 1L, drop = FALSE] : object of type 'S4' is not subsettable
    

    That error is a bit opaque. It arises because our method is not dispatched for the signature in the call. In practice, you would catch such cases with additional methods throwing more transparent errors.
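
    As a rough sketch of what such an additional method could look like, one option is a catch-all for the remaining signatures that repeats the same message. This is an assumption about how you might structure it, not the only way; the more specific method is still the one selected for x[] and x[i]:

    ```r
    library(methods)

    Foo <- setClass("Foo", slots = c(vec = "numeric"))

    # Method from above: handles x[] and x[i]
    setMethod("[", signature = c(x = "Foo", i = "ANY", j = "missing", drop = "missing"),
              function(x, i, j, ..., drop = TRUE) {
                  if (nargs() > 2L)
                      stop("'x' of class Foo must be indexed as x[i]")
                  else if (missing(i))
                      x@vec
                  else x@vec[i]
              })

    # Assumed catch-all: any call that supplies j or drop gets a clear error
    setMethod("[", signature = c(x = "Foo", i = "ANY", j = "ANY", drop = "ANY"),
              function(x, i, j, ..., drop = TRUE) {
                  stop("'x' of class Foo must be indexed as x[i]")
              })

    foo.v <- Foo(vec = c(1, 2, 3))
    ok  <- foo.v[-2L]
    msg <- tryCatch(foo.v[1L, 1L, drop = FALSE],
                    error = function(e) conditionMessage(e))
    ```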

    Now let's consider a second class Bar similar to Foo, but whose vec slot can be a numeric vector or any array.

    setClassUnion("vectorOrArray", c("numeric", "array"))
    Bar <- setClass("Bar", slots = c(vec = "vectorOrArray"))
    bar.v <- Bar(vec = c(1, 2, 3))
    bar.a <- Bar(vec = toeplitz(1:6))
    

    For this vector- or array-like class Bar, I would define:

    setMethod("[", signature = c(x = "Bar", i = "ANY", j = "ANY", drop = "ANY"), 
              function(x, i, j, ..., drop = TRUE) {
                  x <- x@vec
                  callGeneric()
              })
    

    where callGeneric implements what you were trying to implement with match.call and do.call, but in a more robust way.
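
    The same "replace x, then re-dispatch" pattern can be seen in isolation with an ordinary (non-primitive) generic. In the sketch below, the generic scaleBy and the class Wrap are hypothetical names invented for illustration; the point is only that callGeneric() with no arguments calls the generic again using the method's own arguments, so the rebound x is what gets dispatched on:

    ```r
    library(methods)

    # Hypothetical generic and wrapper class, for illustration only
    setGeneric("scaleBy", function(x, k = 2) standardGeneric("scaleBy"))
    setClass("Wrap", slots = c(val = "numeric"))

    # The method for the inner type does the real work
    setMethod("scaleBy", "numeric", function(x, k = 2) x * k)

    # The method for the wrapper swaps in the slot, then re-dispatches
    setMethod("scaleBy", "Wrap", function(x, k = 2) {
        x <- x@val
        callGeneric()
    })

    w <- new("Wrap", val = c(1, 2))
    with_k    <- scaleBy(w, k = 3)  # k is forwarded to the numeric method
    without_k <- scaleBy(w)         # the numeric method's default k applies
    ```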

    bar.v[]
    ## [1] 1 2 3
    
    bar.v[-2L]
    ## [1] 1 3
    
    bar.v[1L, 1L, drop = FALSE]
    ## Error in x[i = i, j = j, drop = drop] : incorrect number of dimensions
    
    bar.a[1L, 1L, drop = FALSE]
    ##      [,1]
    ## [1,]    1
    

    One could also define a method for Foo based on callGeneric, but for a purely vector-like class, the overhead of callGeneric might be a barrier. The earlier method using just nargs and missing is much faster, and we often want methods for [ to be tightly optimized, since they tend to be called in loops.

    microbenchmark::microbenchmark(foo.v[-2L], bar.v[-2L], times = 1000L)
    ## Unit: nanoseconds
    ##        expr   min    lq      mean median    uq   max neval
    ##  foo.v[-2L]   861  1148  1385.062   1271  1476 11808  1000
    ##  bar.v[-2L] 12382 14842 15671.225  15498 16031 77613  1000
    

    Remark

    The two setMethod calls above follow two good practices: