I am trying to pass an S4 object (a sparse matrix of class dgCMatrix from the Matrix package) to a function in another package (Wrench). I am getting some very strange behavior that I am struggling to understand. I suspect it has something to do with S3/S4 method dispatch in R, a subject I don't really understand.
Example setup:
library(Matrix)
library(Wrench)
M <- Matrix(10 + 1:28, 4, 7)
M[, c(2,4:6)] <- 0
sM <- as(M, "sparseMatrix")
print(sM)
# 4 x 7 sparse Matrix of class "dgCMatrix"
#
# [1,] 11 . 19 . . . 35
# [2,] 12 . 20 . . . 36
# [3,] 13 . 21 . . . 37
# [4,] 14 . 22 . . . 38
print(rowSums(sM))
# [1] 65 68 71 74
Next, I call the function Wrench::wrench to do my analysis:
wrench(sM, colData=c('A','B','A','B','B','A','B'))
# Error in h(simpleError(msg, call)) :
# error in evaluating the argument 'i' in selecting a method for function '[': 'x' must be an array of at least two dimensions
Strange... I set a breakpoint on that line in RStudio and step into wrench, which starts like this:
function (mat, condition, etype = "w.marg.mean", ebcf = TRUE,
z.adj = FALSE, phi.adj = TRUE, detrend = FALSE, ...)
{
mat <- mat[rowSums(mat) > 0, ]
# ...
I step into that first line, into the call to rowSums(mat), and find myself in base::rowSums.
function (x, na.rm = FALSE, dims = 1L)
{
if (is.data.frame(x))
x <- as.matrix(x)
if (!is.array(x) || length(dn <- dim(x)) < 2L)
stop("'x' must be an array of at least two dimensions")
is.data.frame(x) is FALSE and is.array(x) is FALSE, so stop() is invoked and the error originates there. OK, so base::rowSums doesn't know how to handle the sparse matrix and raises the error...
But why is base::rowSums being invoked when rowSums(mat) is called within wrench, whereas Matrix::rowSums.dgCMatrix is being invoked and everything works fine when I call rowSums(sM) in the outer scope?
More generally, how do I debug this kind of problem, where method dispatch is not happening as expected (especially within a package that I don't control)?
Other things I have tried:
I tried using debugcall, i.e. eval(debugcall(rowSums(sM))) from the outer scope, and it takes me somewhere in Matrix that calls some C code, and ultimately the row sums are printed. When I try the same thing with the debugger paused within Wrench::wrench, I get an error:
Browse[1]> eval(debugcall(rowSums(mat)))
Error in .signatureFromCall(func, mcall, env) :
trying to get slot "signature" from an object of a basic class ("function") with no slots
I have read the S4 chapter of Advanced R several times
Your NAMESPACE needs either
import(Matrix)
or
importFrom("Matrix", ...)
importClassesFrom("Matrix", ...)
importMethodsFrom("Matrix", "[", rowSums, ...)
the latter importing only as necessary (a practice that you are encouraged to follow).
As for debugging, note that several enclosures separate the evaluation environment of the function wrench from your global environment and the rest of the search path:
1. the evaluation environment of 'wrench'
2. the 'Wrench' namespace
3. the 'Wrench' imports
4. the 'base' namespace
5. the global environment { and the rest of search() }
You can verify that by calling parent.env repeatedly inside the debugger, like so:
> library(Wrench)
> debugonce(wrench)
> wrench()
Browse[2]> e <- environment()
Browse[2]> repeat { print(e); if (identical(e, .GlobalEnv)) break; e <- parent.env(e) }
debug at #1: print(e)
Browse[3]> c
<environment: 0x11e032028>
<environment: namespace:Wrench>
<environment: 0x11e84bf40>
attr(,"name")
[1] "imports:Wrench"
<environment: namespace:base>
<environment: R_GlobalEnv>
Browse[2]>
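The same walk can be tried without Wrench or a debugger. The following is a minimal sketch, assuming only base R: it starts from the namespace of the stats package (chosen purely as a stand-in for a package that does not import from Matrix) and collects environment names up to the global environment. The helper enclosure_chain is a hypothetical name introduced for illustration. It also looks up rowSums as code inside that namespace would, which should resolve to the base function regardless of what is attached to the search path.

```r
# Hypothetical helper: collect the names of all enclosures from 'env'
# up to (and including) the global environment.
enclosure_chain <- function(env) {
  out <- character()
  repeat {
    out <- c(out, environmentName(env))
    if (identical(env, globalenv())) break
    env <- parent.env(env)
  }
  out
}

# Start from the environment of a function defined in the 'stats' namespace.
chain <- enclosure_chain(environment(stats::sd))
print(chain)

# Resolve 'rowSums' the way code evaluated inside 'stats' would.
found <- get("rowSums", envir = asNamespace("stats"))
print(environmentName(environment(found)))
```

Because the base namespace sits before the global environment in this chain, attaching Matrix with library() does not change what a package function sees unless that package imports the generic itself.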
When you import methods for rowSums from Matrix, an S4 generic rowSums is placed in your package imports. That generic is found before the non-generic rowSums in the base namespace, because your package imports are searched first. When called, the generic rowSums dispatches to an appropriate method from the corresponding methods table. If it does not find one, it calls the default method, which for rowSums is just the non-generic function.
Browse[2]> rowSums
standardGeneric for "rowSums" defined from package "base"
function (x, na.rm = FALSE, dims = 1, ...)
standardGeneric("rowSums")
<bytecode: 0x1182be180>
<environment: 0x1182b78f8>
Methods may be defined for arguments: x
Use showMethods(rowSums) for currently available ones.
Browse[2]> environment(rowSums)$.MTable$CsparseMatrix
Method Definition:
function (x, na.rm = FALSE, dims = 1L, ...)
{
.local <- function (x, na.rm = FALSE, dims = 1L, sparseResult = FALSE)
.Call("CRsparse_rowSums", x, na.rm, FALSE, sparseResult)
.local(x, na.rm, dims, ...)
}
<bytecode: 0x11b88dd60>
<environment: namespace:Matrix>
Signatures:
x
target "CsparseMatrix"
defined "CsparseMatrix"
Browse[2]> rowSums@default
Method Definition (Class "derivedDefaultMethod"):
function (x, na.rm = FALSE, dims = 1, ...)
base::rowSums(x, na.rm = na.rm, dims = dims, ...)
<environment: 0x118261ff8>
Signatures:
x
target "ANY"
defined "ANY"
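The fallback from a generic to its default method can be reproduced on a small scale with the methods package alone. This is only a sketch under stated assumptions: the function total and the class Bag are hypothetical stand-ins for base::rowSums and a Matrix class, and the code is meant to be run at top level in a fresh session.

```r
library(methods)

# Hypothetical plain function, standing in for base::rowSums.
total <- function(x, ...) sum(x)

# Hypothetical S4 class, standing in for a class from Matrix.
setClass("Bag", representation(items = "numeric"))

# Promote 'total' to an S4 generic; the existing function becomes its default method.
setGeneric("total")

# Register a method for the new class.
setMethod("total", "Bag", function(x, ...) sum(x@items))

# No method is registered for "numeric", so the default (the original function) runs.
plain <- total(c(1, 2, 3))

# A "Bag" argument dispatches to the method registered above.
bagged <- total(new("Bag", items = c(10, 20)))

print(plain)
print(bagged)
```

The point of the sketch is that the generic and the plain function are different objects: code that only ever sees the plain function never consults the methods table, which matches the behavior described in the question.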
Importantly, if Wrench imports from Matrix, then loading Wrench automatically loads Matrix, so methods for rowSums exported by Matrix are always available in the methods table for use by functions in Wrench.
A subtle aspect of this example is that the [ operator, unlike rowSums, is internally S4 generic. An S4 generic [ is not placed in your package imports when you import methods for [, but S4 dispatch works anyway.
So, in theory, you can get away without importing methods for [ from Matrix, as long as you import something from Matrix to ensure that its methods are loaded whenever Wrench is loaded. But that's really just a technical detail; for clarity, do use NAMESPACE to import all of the methods that your package might need.