Tags: scipy, outer-join

SciPy equivalent of np.minimum.outer


I need to compute the outer product of two (large) sparse vectors, taking the minimum of the values instead of multiplying them, and I need an efficient way to do it. The built-in point-wise multiplication takes 30 ms, while my (naive) code takes 5 minutes:

%%timeit -r 1
# c1, c2 are 1 x N sparse row vectors
from scipy.sparse import lil_matrix
minl = lil_matrix((c1.shape[1], c2.shape[1]))
for i1 in c1.nonzero()[1]:
    for i2 in c2.nonzero()[1]:
        minl[i1, i2] = min(c1[0, i1], c2[0, i2])

Solution

  • Maybe I should insist on a [mcve], but let's take a guess at a reasonable sample:

    In [107]: from scipy import sparse    
    In [108]: c1 = sparse.random(1,100,.2,'csr')    
    In [109]: c2 = sparse.random(1,100,.2,'csr')
    
    In [110]: def foo(c1,c2):
         ...:     minl = sparse.lil_matrix((c1.shape[1], c2.shape[1]))
         ...:     for i1 in c1.nonzero()[1]:
         ...:         for i2 in c2.nonzero()[1]:
         ...:             minl[i1, i2] = min(c1[0, i1], c2[0, i2])
         ...:     return minl
         ...: 
    
    In [111]: LL1 = foo(c1,c2)
    
    In [112]: LL1
    Out[112]: 
    <100x100 sparse matrix of type '<class 'numpy.float64'>'
        with 400 stored elements in List of Lists format>
    

    Dense with broadcasting

    For dense arrays, a simple element-wise minimum of arrays with matching shapes is:

    In [118]: np.minimum(c1.A,c2.A).shape
    Out[118]: (1, 100)
    

    and with broadcasting we can do an 'outer' minimum:

    In [119]: np.minimum(c1.T.A,c2.A).shape
    Out[119]: (100, 100)
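
    For dense inputs this broadcast is exactly what the np.minimum.outer of the title computes; an equivalent call (my addition, not part of the original session):

    np.minimum.outer(c1.A[0], c2.A[0])    # same (100, 100) array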
    

    and that matches your iterative answer:

    In [122]: np.allclose(np.minimum(c1.T.A,c2.A),LL1.A)
    Out[122]: True
    

    And comparative times:

    In [123]: timeit LL1 = foo(c1,c2)
    37.8 ms ± 240 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    In [124]: timeit np.minimum(c1.T.A,c2.A)
    257 µs ± 3.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    

    But sparse doesn't do broadcasting.
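
    (A quick check of that claim, my addition; at least on the SciPy versions I've used, the mismatched shapes just raise an error rather than broadcasting:)

    try:
        c1.T.minimum(c2)      # (100,1) vs (1,100) - no broadcasting
    except ValueError as e:
        print(e)              # inconsistent shapes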

    I suggested building a coo matrix in the loop (foo0 under "List iteration" below).

    Another possibility is to 'replicate' c1.T and c2 so we can do the minimum.

    Expanded sparse matrices

    I can expand the matrices by multiplying each by an appropriate sparse ones matrix:

    In [135]: (c1.T*sparse.csr_matrix(np.ones((1,100)))).minimum(sparse.csr_matrix(np.ones((100,1))*c2))
    Out[135]: 
    <100x100 sparse matrix of type '<class 'numpy.float64'>'
        with 400 stored elements in Compressed Sparse Column format>
    
    In [136]: np.allclose(_.A,LL1.A)
    Out[136]: True
    

    That times at:

    1.91 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    

    and packaged neatly:

    def foo1(c1,c2):
        # rows/columns of ones, stored as sparse, used to replicate
        # c1.T across the columns and c2 down the rows
        On1 = sparse.csr_matrix(np.ones((1,c2.shape[1])))
        On2 = sparse.csr_matrix(np.ones((c1.shape[1],1)))
        cc1 = c1.T*On1
        cc2 = On2*c2
        return cc1.minimum(cc2)
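
    A quick sanity check (my addition, not part of the original session) that the packaged version matches the loop result:

    assert np.allclose(foo1(c1, c2).A, LL1.A)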
    

    List iteration

    Here's a "pure Python" iterative version:

    def foo0(c1,c2):
        data, row, col = [],[],[]
        # stored values and their column indices for each vector
        cc = c1.tocoo(); d1 = cc.data.tolist(); l1 = cc.col.tolist()
        cc = c2.tocoo(); d2 = cc.data.tolist(); l2 = cc.col.tolist()
        # pair every nonzero of c1 with every nonzero of c2
        for i,i1 in enumerate(l1):
            for j,i2 in enumerate(l2):
                data.append(min(d1[i],d2[j]))
                row.append(i1)
                col.append(i2)
        minl = sparse.coo_matrix((data, (row,col)), shape=(c1.shape[1],c2.shape[1]))
        return minl
    
    In [170]: foo0(c1,c2)
    Out[170]: 
    <100x100 sparse matrix of type '<class 'numpy.float64'>'
        with 400 stored elements in COOrdinate format>
    
    In [171]: np.allclose(_.A,LL1.A)
    Out[171]: True
    
    In [172]: timeit foo0(c1,c2)
    576 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    

    We don't have to iterate over c2; instead we can use array methods:

    def foo02(c1,c2):
        data, row, col = [],[],[]
        cc = c1.tocoo(); d1 = cc.data.tolist(); l1 = cc.col.tolist()
        cc = c2.tocoo(); d2 = cc.data; l2 = cc.col    # keep c2's as arrays
        # loop only over c1's nonzeros, handling all of c2's at once each pass
        for i,i1 in enumerate(l1):
            data.append(np.minimum(d1[i],d2))
            row.append([i1]*l2.shape[0])
            col.append(l2)
        data = np.hstack(data)
        row = np.hstack(row)
        col = np.hstack(col)
        minl = sparse.coo_matrix((data, (row,col)), shape=(c1.shape[1],c2.shape[1]))
        return minl
    

    This is a modest time improvement over foo0, and it may scale better.
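
    Going a step further, the Python loop can be dropped entirely by applying np.minimum.outer to just the stored values; a sketch (foo03 is my name, and this variant wasn't timed in the original answer):

    def foo03(c1,c2):
        a = c1.tocoo(); b = c2.tocoo()
        # min over every (c1 value, c2 value) pair, flattened row-major
        data = np.minimum.outer(a.data, b.data).ravel()
        # matching outer pairing of the stored column indices
        row = np.repeat(a.col, b.col.size)
        col = np.tile(b.col, a.col.size)
        return sparse.coo_matrix((data, (row, col)),
                                 shape=(c1.shape[1], c2.shape[1]))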

    Your suggested change

    def foo11(c1,c2):
        # "ones" matrices with the same sparsity pattern as c2 and c1.T,
        # so the replication only fills the rows/columns that are needed
        On1 = c2.copy(); On1.data[:] = 1
        On2 = c1.T.copy(); On2.data[:] = 1
        cc1 = c1.T*On1
        cc2 = On2*c2
        return cc1.minimum(cc2)
    

    1.21 ms ± 14 µs per loop

    Somewhat faster than foo1, presumably because these ones matrices only have stored values at the original nonzero positions, so the replicating multiplications and the minimum have less to do.
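
    And a check (my addition, not in the original session) that this variant still matches the loop result:

    assert np.allclose(foo11(c1, c2).A, LL1.A)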