I am trying to compute the m
first eigenvectors of a large sparse matrix in R. Using eigen()
is not realistic because large means N > 106 here.
So far I figured out that I should use ARPACK from the igraph
package, which can deal with sparse matrices. However I can't get it to work on a very simple (3x3) matrix:
library(Matrix)
library(igraph)
TestDiag <- Diagonal(3, 3:1)
TestMatrix <- t(sparseMatrix(i = c(1, 1, 2, 2, 3), j = c(1, 2, 1, 2, 3), x = c(3/5, 4/5, -4/5, 3/5, 1)))
TestMultipliedMatrix <- t(TestMatrix) %*% TestDiag %*% TestMatrix
And then using the code given in example of the help of the arpack()
function to extract the 2 first eigenvectors :
func <- function(x, extra=NULL) { as.vector(TestMultipliedMatrix %*% x) }
arpack(func, options=list(n = 3, nev = 2, ncv = 3, sym=TRUE, which="LM", maxiter=200), complex = FALSE)
I get an error message:
Error in arpack(func, options = list(n = 3, nev = 2, ncv = 3, sym = TRUE, :
At arpack.c:1156 : ARPACK error, NCV must be greater than NEV and less than or equal to N
I don't understand this error, as ncv (3) is greater than nev (2) here, and equal to N (3).
Am I making some stupid mistake or is there a better way to compute eigenvectors of a sparse matrix in R?
Update
This error is apparently due to a bug in the arpack()
function with uppercase / lowercase NCV and NEV.
Any suggestions to solve the bug (I tried to have a look at the package code but it is far too complex for me to understand) or compute the eigenvectors in an other way are welcome.
There are actually no bugs here, but you made a mistake putting sym=TRUE
into the ARPACK option list, but sym
is an argument of the arpack()
function. I.e. the correct call is:
ev <- arpack(func, options=list(n=3, nev=2, ncv=3, which="LM", maxiter=200),
sym=TRUE, complex = FALSE)
ev$values
# [1] 3 2
ev$vectors
# [,1] [,2]
# [1,] -6.000000e-01 -8.000000e-01
# [2,] 8.000000e-01 -6.000000e-01
# [3,] 2.220446e-16 -9.714451e-17
If you are interested in the details, what happens is that instead of the symmetric, the general non-symmetric eigensolver is called and for that NCV-NEV >= 2 is also required. From the ARPACK source (dnaupd.f):
...
c NOTE: 2 <= NCV-NEV in order that complex conjugate pairs of Ritz
c values are kept together. (See remark 4 below)
...
Some more comments, only loosely related to your question. arpack()
can be quite slow. The problem with it is that you need to call back to R from the C code in each iteration. See this thread: http://lists.gnu.org/archive/html/igraph-help/2012-02/msg00029.html
The bottom line is that arpack()
only helps if your matrix-vector product callback is fast and you don't need many iterations, the latter being related to the eigenstructure of the matrix.
I created an issue in the igraph issue tracker, to see if it would be possible to optionally use C callback, using Rcpp, instead of the R callback: https://github.com/igraph/igraph/issues/491 You can follow this issue if you are interested.