pythonscipysparse-matrix

scipy.sparse: one subarray at many locations in a larger array


Say I have a sparse subarray whose contents and shape are known:

import scipy.sparse as sp
sub = sp.coo_array([[a, b], [c, d]])

I'd like to place this subarray at many locations, according to some known pattern, in a larger sparse array of arbitrary size. The code can compute the size of the large array (say NxN) but the numerical value of N isn't known to me.

Using our contrived NxN and an arbitrary pattern, if N=4, the end result might be:

[[a, b, a, b]
 [c, d, c, d]
 [0, 0, a, b]
 [0, 0, c, d]]

and for N=8:

[[a, b, a, b, 0, 0, 0, 0]
 [c, d, c, d, 0, 0, 0, 0]
 [0, 0, a, b, a, b, 0, 0]
 [0, 0, c, d, c, d, 0, 0]
 [0, 0, 0, 0, a, b, a, b]
 [0, 0, 0, 0, c, d, c, d]
 [0, 0, 0, 0, 0, 0, a, b]
 [0, 0, 0, 0, 0, 0, c, d]]

N can be huge (tens of thousands), therefore a dense array is not computationally manageable, and it's not practical to manually specify each of the locations.

sp.block_array appears to require specifying placement manually. sp.block_diag gets close to the contrived example but doesn't handle an arbitrary pattern. This question is similar but a) uses numpy, not scipy.sparse, and b) doesn't address placing the same subarray many times. This answer is related but only practical for arrays where N is small: in my case, alist would be huge and mostly full of None, even if I use a loop as suggested, which defeats the purpose of sparse arrays.

An ideal solution might have a list of array indices at which the top left corner of the submatrix is placed. It should work with any pattern, even a bunch of random locations within NxN. I'm not very experienced with scipy.sparse but it seems like there should be a "good" way to do this sort of operation.


Solution

  • Thanks to Reinderien's comment, I was able to figure this out - I had no idea what a Kronecker product was until now. sp.kron does exactly what I want, with the added benefit of being able to multiply each block by a coefficient.

    For the contrived example, the code to specify the pattern would be:

    import scipy.sparse as sp
    import numpy as np
    
    # Setup subarray and big array parameters
    a, b, c, d = 1, 2, 3, 4
    sub = sp.coo_array([[a, b], [c, d]])
    N = 8
    
    # Setup block locations for our arbitrary pattern
    row_idx = np.hstack((np.arange(N/sub.shape[0], dtype=int), np.arange(N/sub.shape[0]-1, dtype=int)))
    col_idx = np.hstack((np.arange(N/sub.shape[1], dtype=int), np.arange(N/sub.shape[0]-1, dtype=int)+1))
    coeff = np.ones_like(row_idx) # Multiply blocks by coefficients here
    locs = sp.csc_array((coeff, (row_idx, col_idx))) # Array of coefficients at specified locations
    
    # Not necessary, but shows what's going on.
    print(f'Placing block top left corners at rows{row_idx*sub.shape[0]}, cols {col_idx*sub.shape[1]}')
    

    Actually creating the sparse array is a one-liner once the locations and subarray are specified:

    arr = sp.kron(locs, sub)
    

    print(arr.toarray()) yields:

    [[1 2 1 2 0 0 0 0]
     [3 4 3 4 0 0 0 0]
     [0 0 1 2 1 2 0 0]
     [0 0 3 4 3 4 0 0]
     [0 0 0 0 1 2 1 2]
     [0 0 0 0 3 4 3 4]
     [0 0 0 0 0 0 1 2]
     [0 0 0 0 0 0 3 4]]
    

    This implementation...