graph fortran sparse-matrix circuit pardiso

Fast access to elements of a Compressed Sparse Row (CSR) sparse matrix


I want to test some of the newer sparse linear solvers, and I want to know whether there is a fast way of filling in the matrix. The format I'm interested in is CSR (http://goo.gl/hLXYd). Let's say the matrix, in CSR format, is given by:

values(num non-zero elements)
columns(num non-zero elements)
rowIndex(num rows + 1)
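For concreteness, here is a minimal sketch of how these three arrays look for a small matrix. It uses 0-based indices (the Fortran-style notation above is 1-based), and the helper name `dense_to_csr` and the sample matrix are purely illustrative.

```python
def dense_to_csr(dense):
    """Build the three CSR arrays from a dense list-of-lists matrix (0-based)."""
    values, columns, row_index = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # non-zero value
                columns.append(j)  # its column index
        # row_index[i + 1] marks where row i ends in values/columns
        row_index.append(len(values))
    return values, columns, row_index


# Structurally symmetric, but the values themselves are not symmetric.
A = [
    [4.0, -1.0, 0.0],
    [-2.0, 5.0, -3.0],
    [0.0, -1.0, 6.0],
]
values, columns, row_index = dense_to_csr(A)
```

Row `i` then occupies the slice `row_index[i]:row_index[i + 1]` of both `values` and `columns`.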

The sparse matrix under consideration derives from networks: I have thousands of nodes, and some of them are connected to each other by lines, so the matrix is structurally symmetric. Each connection (i,j) adds something to the diagonal terms (i,i) and (j,j) and to the off-diagonal terms (i,j) and (j,i). There can be several connections between the same pair of nodes, (i,j,1), (i,j,2), ..., so I might need to revisit the two diagonal and two off-diagonal elements more than once.

I know I can get the beginning of row i from rowIndex(i). Then I would have to run through the elements columns(rowIndex(i):rowIndex(i+1)-1) to find where j is located.
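The per-row search described here can be sketched as follows. This is only an illustration: it uses 0-based indices, and it assumes the column indices are sorted in ascending order within each row, which allows a binary search instead of a linear scan. The function names `csr_find` and `csr_add` are hypothetical.

```python
from bisect import bisect_left


def csr_find(row_index, columns, i, j):
    """Return the position of entry (i, j) in values/columns, or -1 if absent.

    Assumes 0-based indices and ascending column order within each row.
    """
    start, end = row_index[i], row_index[i + 1]
    pos = bisect_left(columns, j, start, end)  # binary search inside row i only
    if pos < end and columns[pos] == j:
        return pos
    return -1


def csr_add(values, columns, row_index, i, j, delta):
    """Add delta to the stored entry (i, j); the entry must already exist."""
    pos = csr_find(row_index, columns, i, j)
    if pos < 0:
        raise KeyError(f"entry ({i}, {j}) is not in the sparsity pattern")
    values[pos] += delta


# Sparsity pattern of a 3x3 tridiagonal matrix, all values starting at zero.
row_index = [0, 2, 5, 7]
columns = [0, 1, 0, 1, 2, 1, 2]
values = [0.0] * len(columns)

csr_add(values, columns, row_index, 1, 2, 2.5)
csr_add(values, columns, row_index, 1, 2, 1.0)  # second connection, same (i, j)
```

This still searches on every update; it only shortens the search to O(log k) for a row with k stored entries.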

The question:

Is there a way of accessing the elements faster, while in CSR format, without having to do a search every time I want to update an element?

Some clarifications: I just need to fill in the matrix from scratch. The matrix is structurally symmetric but not numerically symmetric. The stored values come from network data (impedances, resistances, etc.) and are real. In general, Value(i,j)<>Value(j,i). I have tuples of the form (name1,i1,j1,value1), (name2,i2,j2,value2), etc. These tuples are not sorted, and two tuples can refer to the same (i,j) pair, in which case their values need to be added.

Thanks in advance!


Solution

  • What you have is the so-called triplet sparse format. Creation of CRS (another name for CSR) from triplets, including merging duplicate entries by summing their values, can be implemented very efficiently. Before programming it yourself, have a look at the SuiteSparse library. It is written in C, but I'm sure you will understand the principle. What interests you is the cholmod_triplet.c file, which implements the functionality you need.

    Essentially, the conversion is performed using a two-phase bucket sort on your row and column indices. This algorithm has linear complexity, which is important if you are interested in processing large data sets.
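To make the idea concrete, here is a minimal sketch of a two-phase bucket (counting) sort that turns unsorted triplets into CSR arrays and sums duplicates. It is not the SuiteSparse code, only an illustration of the same general principle. It assumes 0-based indices, and the function name `triplets_to_csr` is made up for this example.

```python
def triplets_to_csr(n_rows, n_cols, rows, cols, vals):
    """Convert unsorted (row, col, value) triplets to CSR, summing duplicates."""
    nnz = len(vals)

    # Phase 1: stable counting sort of the triplet indices by column.
    col_start = [0] * (n_cols + 1)
    for c in cols:
        col_start[c + 1] += 1
    for c in range(n_cols):
        col_start[c + 1] += col_start[c]
    by_col = [0] * nnz
    nxt = col_start[:-1]  # next free slot per column (a copy)
    for k in range(nnz):
        by_col[nxt[cols[k]]] = k
        nxt[cols[k]] += 1

    # Phase 2: stable counting sort by row, visiting triplets in column order,
    # so the result is ordered by row and, within each row, by column.
    row_start = [0] * (n_rows + 1)
    for r in rows:
        row_start[r + 1] += 1
    for r in range(n_rows):
        row_start[r + 1] += row_start[r]
    order = [0] * nnz
    nxt = row_start[:-1]  # next free slot per row (a copy)
    for k in by_col:
        order[nxt[rows[k]]] = k
        nxt[rows[k]] += 1

    # Compression: duplicates are now adjacent, so sum each run into one entry.
    values, columns, row_index = [], [], [0] * (n_rows + 1)
    prev = None
    for k in order:
        key = (rows[k], cols[k])
        if key == prev:
            values[-1] += vals[k]
        else:
            values.append(vals[k])
            columns.append(cols[k])
            row_index[rows[k] + 1] += 1  # count unique entries per row
            prev = key
    for r in range(n_rows):
        row_index[r + 1] += row_index[r]
    return values, columns, row_index


# Unsorted triplets; entry (1, 2) appears twice and gets summed.
rows = [1, 0, 1, 0, 2, 1]
cols = [2, 0, 0, 1, 2, 2]
vals = [1.0, 4.0, -2.0, -1.0, 6.0, 0.5]
values, columns, row_index = triplets_to_csr(3, 3, rows, cols, vals)
```

Every loop runs over the triplets, the rows, or the columns exactly once, so the total work is linear in the number of triplets plus the matrix dimensions.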

    Edit: If you want to skip explicit creation of the triplet format altogether, you can do that by generating the (row, col) connectivities on the fly and adding them to a dynamic sparse structure. I usually do it using insertion sort and sorted lists, which is in practice the fastest. It is also faster than triplet-to-CRS conversion and uses much less memory. The method goes as follows:

    This method is faster than using triplet-to-sparse conversion, at least for FEM models, for which I use it. The general reason is that memory bandwidth is the bottleneck here, and the above scheme uses much less memory.

    Have a look at a performance comparison of those two methods (Figure 1) for triangular elements in 2D. Note that the performance difference depends on the ratio of the number of entries in the triplet format to the number in the assembled sparse matrix (Table 2). In general, though, the method is never worse than triplet-to-CRS conversion, and the triplets need to be created in the first place. You can also download a MATLAB MEX function, sparse_create, which is part of the mutils package (see the downloads section).
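As a rough illustration of the sorted-list idea mentioned in the edit above, and not necessarily the exact scheme used in mutils, one could keep a sorted column list per row, insert or accumulate each contribution as it is generated, and concatenate the rows into CSR at the end. The class name `RowLists` and its methods are hypothetical, indices are 0-based, and the insertion position is found here with a binary search, which is my own assumption.

```python
from bisect import bisect_left


class RowLists:
    """Dynamic sparse structure: one sorted column list (plus values) per row."""

    def __init__(self, n_rows):
        self.cols = [[] for _ in range(n_rows)]
        self.vals = [[] for _ in range(n_rows)]

    def add(self, i, j, v):
        """Add v to entry (i, j), creating it in sorted position if needed."""
        cols = self.cols[i]
        pos = bisect_left(cols, j)
        if pos < len(cols) and cols[pos] == j:
            self.vals[i][pos] += v      # duplicate (i, j): accumulate
        else:
            cols.insert(pos, j)         # new entry: keep the row sorted
            self.vals[i].insert(pos, v)

    def to_csr(self):
        """Concatenate the per-row lists into the three CSR arrays."""
        values, columns, row_index = [], [], [0]
        for cols, vals in zip(self.cols, self.vals):
            columns.extend(cols)
            values.extend(vals)
            row_index.append(len(columns))
        return values, columns, row_index


# Unsorted (name, i, j, value) tuples as in the question; "a" and "c" share (0, 1).
entries = [("a", 0, 1, 2.0), ("b", 1, 0, 3.0), ("c", 0, 1, 0.5), ("d", 1, 1, 1.0)]
m = RowLists(2)
for _name, i, j, v in entries:
    m.add(i, j, v)
values, columns, row_index = m.to_csr()
```

Because duplicates are merged on insertion, memory use is proportional to the number of unique entries rather than to the number of contributions, which matches the memory argument made above.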