Tags: python, scipy, sparse-matrix, memory-size

Retrieving the number of bytes consumed by a SciPy sparse matrix


Let's say I want to monitor the memory occupied by my SciPy sparse matrix mat. In NumPy I would use the nbytes attribute, but SciPy sparse matrices do not seem to offer anything similar. How can I retrieve this information?


Solution

  • I have a sparse matrix X:

    In [605]: X
    Out[605]: 
    <100x100 sparse matrix of type '<class 'numpy.float64'>'
        with 1000 stored elements in Compressed Sparse Row format>
    

    sys.getsizeof doesn't tell me anything useful, because it only measures the Python object itself, not the arrays it references:

    In [606]: import sys
    In [607]: sys.getsizeof(X)
    Out[607]: 28
    

    For a csr matrix, the data and indices are stored in 3 arrays:

    In [612]: X.data.nbytes
    Out[612]: 8000
    In [613]: X.indices.nbytes
    Out[613]: 4000
    In [614]: X.indptr.nbytes
    Out[614]: 404
    

    So roughly the total space is the sum of those values.

    For the coo format:

    In [615]: Xc=X.tocoo()
    In [616]: Xc.data.nbytes
    Out[616]: 8000
    In [617]: Xc.row.nbytes
    Out[617]: 4000
    In [618]: Xc.col.nbytes
    Out[618]: 4000
    

    We could also calculate those values from shape, dtype and nnz: e.g. 8 bytes * 1000 for data, 4 bytes * 1000 for indices, 4 bytes * (X.shape[0] + 1) for indptr (which is where the 404 comes from), etc.

    Other formats (e.g. lil, dok) require knowledge of their own data storage methods.
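
The two approaches above could be wrapped up roughly as follows. This is only a sketch: sparse_nbytes and estimated_csr_nbytes are names made up for this example, not part of SciPy, and the int32 default for the index arrays is an assumption that matches the 4-byte indices seen above (very large matrices may use int64 indices instead).

```python
import numpy as np
from scipy import sparse


def sparse_nbytes(mat):
    """Sum nbytes of the NumPy arrays backing a csr/csc/bsr or coo matrix.

    Hypothetical helper, not a SciPy API. It ignores the small overhead
    of the Python wrapper object itself.
    """
    if mat.format in ("csr", "csc", "bsr"):
        arrays = (mat.data, mat.indices, mat.indptr)
    elif mat.format == "coo":
        arrays = (mat.data, mat.row, mat.col)
    else:
        # lil, dok, dia, ... store their data differently
        raise NotImplementedError(f"no size rule for format {mat.format!r}")
    return sum(a.nbytes for a in arrays)


def estimated_csr_nbytes(shape, nnz, data_dtype=np.float64, index_dtype=np.int32):
    """Estimate csr storage from shape, nnz and dtypes, without the matrix."""
    data_size = np.dtype(data_dtype).itemsize
    index_size = np.dtype(index_dtype).itemsize
    # data: nnz values, indices: nnz column indices, indptr: n_rows + 1 offsets
    return nnz * data_size + nnz * index_size + (shape[0] + 1) * index_size


# A matrix similar to the X used above: 100x100 with 10% stored elements
X = sparse.random(100, 100, density=0.1, format="csr", random_state=0)
print(sparse_nbytes(X))
print(sparse_nbytes(X.tocoo()))
print(estimated_csr_nbytes(X.shape, X.nnz, X.dtype, X.indices.dtype))
```

Passing X.indices.dtype to the estimate, rather than relying on the int32 default, keeps the calculation in step with whatever index type SciPy actually chose for the matrix.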