r bioconductor r-s4

Debugging unexpected S4 method dispatch


I am trying to pass an S4 object (sparse matrix of class dgCMatrix from the Matrix package) to a function in another package (Wrench). I am getting some very strange behavior that I am struggling to understand. I suspect it has something to do with S3/S4 method dispatch in R, a subject which I don't really understand.

Example setup:

library(Matrix)
library(Wrench)

M <- Matrix(10 + 1:28, 4, 7)
M[, c(2,4:6)] <- 0
sM <- as(M, "sparseMatrix")
print(sM)
# 4 x 7 sparse Matrix of class "dgCMatrix"
#                     
# [1,] 11 . 19 . . . 35
# [2,] 12 . 20 . . . 36
# [3,] 13 . 21 . . . 37
# [4,] 14 . 22 . . . 38

print(rowSums(sM))
# [1] 65 68 71 74

Next, I call Wrench::wrench to do my analysis:

wrench(sM, colData=c('A','B','A','B','B','A','B'))
# Error in h(simpleError(msg, call)) : 
#   error in evaluating the argument 'i' in selecting a method for function '[': 'x' must be an array of at least two dimensions

Strange. I set a breakpoint on that line in RStudio and stepped into wrench, which starts like this:

function (mat, condition, etype = "w.marg.mean", ebcf = TRUE, 
  z.adj = FALSE, phi.adj = TRUE, detrend = FALSE, ...) 
{
  mat <- mat[rowSums(mat) > 0, ]
  # ...

Stepping into that first line, into the call to rowSums(mat), I find myself in base::rowSums:

function (x, na.rm = FALSE, dims = 1L) 
{
  if (is.data.frame(x)) 
    x <- as.matrix(x)
  if (!is.array(x) || length(dn <- dim(x)) < 2L) 
    stop("'x' must be an array of at least two dimensions")

Here is.data.frame(x) is FALSE and is.array(x) is FALSE, so stop() is invoked and the error originates there. OK, so base::rowSums doesn't know how to handle the sparse matrix and raises the error.

But why is base::rowSums invoked when rowSums(mat) is called within wrench, whereas the Matrix method for rowSums is dispatched and everything works fine when I call rowSums(sM) in the outer scope?

More generally, how do I debug this kind of problem, where method dispatch is not happening as expected (especially within a package that I don't control)?

Other things I have tried:


Solution

  • For S4 dispatch on Matrix classes to work inside a package such as Wrench, its NAMESPACE needs either

    import(Matrix)
    

    or

    importFrom("Matrix", ...)
    importClassesFrom("Matrix", ...)
    importMethodsFrom("Matrix", "[", rowSums, ...)
    

    the latter importing only as necessary (a practice that you are encouraged to follow).

    As for debugging, note that several enclosures separate the evaluation environment of the function wrench from your global environment and the rest of the search path:

    1. the evaluation environment of 'wrench'
    2. the 'Wrench' namespace
    3. the 'Wrench' imports
    4. the 'base' namespace
    5. the global environment (and the rest of search())
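
The effect of this lookup order can be mimicked with plain environments. This is only a sketch: search_path, fake_imports, fake_namespace, user_fun and pkg_fun are hypothetical stand-ins (nothing here comes from Wrench or Matrix), and a function returning a string stands in for a real S4 generic.

```r
# Stand-in for the search path, where library(Matrix) would attach its rowSums.
search_path <- new.env(parent = baseenv())
assign("rowSums", function(x, ...) "generic from the search path",
       envir = search_path)

# Stand-ins for a package's imports and namespace; as in a real package,
# the imports environment is enclosed by base.
fake_imports   <- new.env(parent = baseenv())
fake_namespace <- new.env(parent = fake_imports)

# The same body, evaluated from two different enclosures.
user_fun <- function(x) rowSums(x)
environment(user_fun) <- search_path
pkg_fun <- function(x) rowSums(x)
environment(pkg_fun) <- fake_namespace

m <- matrix(1:4, nrow = 2)
from_user <- user_fun(m)  # finds the search-path version
from_pkg  <- pkg_fun(m)   # never sees it: base::rowSums is found first

# Mimic importMethodsFrom(): bind the generic in the imports environment.
assign("rowSums", get("rowSums", envir = search_path), envir = fake_imports)
from_pkg_fixed <- pkg_fun(m)  # now the imported version is found before base
```

Before the "import", pkg_fun behaves like wrench in the question; afterwards it picks up the same function that a top-level call would.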
    

    You can verify that by calling parent.env recursively inside of the debugger, like so:

    > library(Wrench)
    > debugonce(wrench)
    > wrench()
    Browse[2]> e <- environment()
    Browse[2]> repeat { print(e); if (identical(e, .GlobalEnv)) break; e <- parent.env(e) }
    debug at #1: print(e)
    Browse[3]> c
    <environment: 0x11e032028>
    <environment: namespace:Wrench>
    <environment: 0x11e84bf40>
    attr(,"name")
    [1] "imports:Wrench"
    <environment: namespace:base>
    <environment: R_GlobalEnv>
    Browse[2]> 
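
The same walk can be turned into a small diagnostic that reports where a name is first found. A minimal sketch, assuming a helper named where_found (hypothetical, not part of any package) and using the stats namespace purely as a stand-in for a package that does not import rowSums from Matrix:

```r
# Hypothetical helper: walk the enclosures of `env` and return the first
# environment that holds a binding for `name` (NULL if none does).
where_found <- function(name, env) {
  repeat {
    if (exists(name, envir = env, inherits = FALSE)) return(env)
    if (identical(env, emptyenv())) return(NULL)
    env <- parent.env(env)
  }
}

# From the 'stats' namespace, rowSums resolves to base, because neither the
# namespace nor its imports bind that name.
hit <- where_found("rowSums", asNamespace("stats"))
print(environmentName(hit))

# The packages a namespace imports from can be listed as well.
print(names(getNamespaceImports("stats")))
```

Inside the debugger, where_found("rowSums", environment()) would answer the same question for the function being debugged.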
    

    When you import methods for rowSums from Matrix, an S4 generic rowSums is placed in your package imports. The generic rowSums there is found before the non-generic rowSums in the base namespace, because your package imports are searched first. When called, the generic rowSums dispatches an appropriate method from the corresponding methods table. If it does not find one, then it calls the default method, which for rowSums is just the non-generic function.

    Browse[2]> rowSums
    standardGeneric for "rowSums" defined from package "base"
    
    function (x, na.rm = FALSE, dims = 1, ...) 
    standardGeneric("rowSums")
    <bytecode: 0x1182be180>
    <environment: 0x1182b78f8>
    Methods may be defined for arguments: x
    Use  showMethods(rowSums)  for currently available ones.
    Browse[2]> environment(rowSums)$.MTable$CsparseMatrix
    Method Definition:
    
    function (x, na.rm = FALSE, dims = 1L, ...) 
    {
        .local <- function (x, na.rm = FALSE, dims = 1L, sparseResult = FALSE) 
        .Call("CRsparse_rowSums", x, na.rm, FALSE, sparseResult)
        .local(x, na.rm, dims, ...)
    }
    <bytecode: 0x11b88dd60>
    <environment: namespace:Matrix>
    
    Signatures:
            x              
    target  "CsparseMatrix"
    defined "CsparseMatrix"
    Browse[2]> rowSums@default
    Method Definition (Class "derivedDefaultMethod"):
    
    function (x, na.rm = FALSE, dims = 1, ...) 
    base::rowSums(x, na.rm = na.rm, dims = dims, ...)
    <environment: 0x118261ff8>
    
    Signatures:
            x    
    target  "ANY"
    defined "ANY"
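
The dispatch-then-default behaviour can be reproduced without Matrix. The following is a sketch under stated assumptions: ToyMatrix is a hypothetical class standing in for a sparse matrix class, and the method formals are copied from the generic printed above.

```r
library(methods)

# Hypothetical toy class standing in for a sparse matrix class.
setClass("ToyMatrix", representation(values = "matrix"))

# Create an S4 generic from base::rowSums; the existing function becomes
# the default method.
setGeneric("rowSums")

# A method for the toy class, with the same formals as the generic.
setMethod("rowSums", "ToyMatrix",
          function(x, na.rm = FALSE, dims = 1, ...) {
            base::rowSums(x@values, na.rm = na.rm, dims = dims)
          })

tm <- new("ToyMatrix", values = matrix(1:6, nrow = 2))
toy_result   <- rowSums(tm)                     # dispatches the ToyMatrix method
plain_result <- rowSums(matrix(1:6, nrow = 2))  # no method: falls to the default

# Tools for checking what dispatch will do for a given class.
showMethods("rowSums")
print(existsMethod("rowSums", "ToyMatrix"))
print(selectMethod("rowSums", "matrix"))
```

showMethods, existsMethod and selectMethod are the usual first stops when a call does not reach the method you expected.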
    

    Importantly, if Wrench imports from Matrix, then loading Wrench automatically loads Matrix, so methods for rowSums exported by Matrix are always available in the methods table for use by functions in Wrench.

    A subtle aspect of this example is that the [ operator, unlike rowSums, is internally S4 generic. An S4 generic [ is not placed in your package imports when you import methods for [, but S4 dispatch works anyway.

    So, in theory, you can get away without importing methods for [ from Matrix, as long as you import something from Matrix to ensure that its methods are loaded whenever Wrench is loaded. But that's really just a technical detail; for clarity, do use NAMESPACE to import all of the methods that your package might need.
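
The point about [ can be illustrated with a toy class. This is a sketch only; ToyVec is a hypothetical class, not something defined by Matrix or Wrench.

```r
library(methods)

# Hypothetical toy class wrapping a numeric vector.
setClass("ToyVec", representation(values = "numeric"))

# Define an S4 method for the primitive `[`, using the formals of the generic.
setMethod("[", "ToyVec", function(x, i, j, ..., drop = TRUE) {
  new("ToyVec", values = x@values[i])
})

v   <- new("ToyVec", values = c(10, 20, 30))
sub <- v[2:3]             # the S4 method is dispatched
print(sub@values)

# `[` is still the base primitive; dispatch happens internally.
print(is.primitive(`[`))
```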