I want to get the row-column coordinates for all nonzero elements in a matrix M. If M isn't too big, it's straightforward:
m <- matrix(sample(0:1, 25, TRUE, prob=c(0.75, 0.25)), 5, 5)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    0    0    0    0
# [2,]    1    1    0    0    0
# [3,]    0    0    0    1    0
# [4,]    0    0    1    0    0
# [5,]    0    0    0    0    0
nz <- which(m != 0)
cbind(row(m)[nz], col(m)[nz])
#      [,1] [,2]
# [1,]    2    1
# [2,]    2    2
# [3,]    4    3
# [4,]    3    4
However, in my case M is a sparse matrix (created using the Matrix package), whose dimensions can be very large. If I call row(M) and col(M) like above, I'll be generating a couple of dense matrices the same size as M, which I definitely don't want to do.
Is there a way of getting a result like the above without creating dense matrices along the way?
I think you want
which(m!=0,arr.ind=TRUE)
Looking at showMethods("which"), it seems that this is set up to work efficiently with sparse matrices. You can also get the answer more directly (but inscrutably) for a sparse, column-oriented matrix (provided it is not stored internally as a symmetric matrix: see comment below) by manipulating the internal @p (column pointer) and @i (0-based row index) slots:
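library(Matrix)
mm <- Matrix(m)   # stored as a sparse, column-oriented matrix when m is mostly zeros
dp <- diff(mm@p)  # number of stored entries in each column
cbind(mm@i+1, rep(seq_along(dp), dp))  # @i is 0-based, so add 1 to get row numbers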
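As a quick check, the two approaches can be compared on a matrix that is built directly in sparse form, so no dense matrix is created along the way. This is only a sketch: the dimensions and the four entries are made up for illustration, and the names M, via_which and via_slots are not from the question.

```r
library(Matrix)

# Hypothetical example: a 10000 x 10000 sparse matrix with four nonzero entries,
# built directly from (row, column, value) triplets.
M <- sparseMatrix(i = c(2, 2, 4, 3), j = c(1, 2, 3, 4), x = c(1, 1, 1, 1),
                  dims = c(10000, 10000))

# Approach 1: which() with arr.ind = TRUE on the sparse logical matrix M != 0.
via_which <- which(M != 0, arr.ind = TRUE)

# Approach 2: expand the column pointers in @p; @i holds 0-based row indices.
dp <- diff(M@p)
via_slots <- cbind(M@i + 1L, rep(seq_along(dp), dp))

# Both should list the same coordinates (values compared, dimnames ignored).
all(unname(via_which) == via_slots)
```

Note that the slot-based version lists every stored entry, so it assumes the matrix holds no explicitly stored zeros.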
@tflutre comments:
If m is symmetric, only which(m != 0, arr.ind=TRUE) always returns the full list of nonzero coordinates. The code using mm@p may not! Indeed, mm <- as(m, "sparseMatrix") can automatically and "silently" detect that m is symmetric and, if so, it may store only the upper (or lower) triangular values: look at the slot mm@uplo.
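The difference can be seen on a small symmetric example. This is a sketch with made-up values; it assumes that Matrix() picks a symmetric storage class for symmetric input, which is worth confirming with class(mm) on your installed version of the package.

```r
library(Matrix)

# A small symmetric matrix with nonzeros at (1,2), (2,1) and (3,3).
m <- matrix(0, 3, 3)
m[1, 2] <- m[2, 1] <- 1
m[3, 3] <- 1

mm <- Matrix(m, sparse = TRUE)  # symmetric input is typically stored as a dsCMatrix
class(mm)

# which() reports every nonzero coordinate, regardless of the storage class.
full <- which(mm != 0, arr.ind = TRUE)

# The slot-based code only sees the stored triangle, so it can return fewer rows.
dp <- diff(mm@p)
stored <- cbind(mm@i + 1L, rep(seq_along(dp), dp))

nrow(full)    # number of nonzero entries in the full matrix
nrow(stored)  # number of entries actually stored
```

If the two counts differ, the matrix is being stored as one triangle only, and the which() result is the one to trust.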