pythonarraysnumpycountingnumpy-einsum

Counting zeros in large numpy arrays without creating them


To illustrate the problem I am facing, here is some example code:

a = np.round(np.random.rand(10, 15))
counta = np.count_nonzero(a, axis=-1)
print(counta)

A = np.einsum('im,mj->ijm', a, a.T)
countA = np.count_nonzero(A, axis=-1)
print(countA)

It creates a 2D array and counts its nonzero elements along the last axis. Then it creates a 3D array, of which the nonzero elements are again counted along the last axis.

My problem is that my array a is so large, that I can perform the first step, but not the second step, since the A array would take up too much memory.

Is there any way to still get countA? That is to count the zeros in A along a given axis without actually creating the array?


Solution

  • I think you can simply use a matrix multiply (dot product) to get your result, no need to generate that massive 3D array A:

    a = np.round(np.random.rand(10, 15)).astype(int)
    counta = np.count_nonzero(a, axis=-1)
    
    A = np.einsum('im,mj->ijm', a, a.T)
    countA = np.count_nonzero(A, axis=-1)
    
    assert np.all(countA == (a @ a.T))
    

    This is also much faster:

    a = np.round(np.random.rand(1000, 1500)).astype(int)
    
    %timeit np.count_nonzero(np.einsum('im,mj->ijm', a, a.T), axis=-1)
    3.94 s ± 38.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    %timeit a @ a.T
    558 ms ± 6.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    Also note that your first step is redundant with the second:

    assert np.all(counta == np.diag(countA))