pythonnumpyunderflow

How to avoid underflow trying to zero out elements


I have a big numpy 2d array F which has complex numbers (np.complex64). It has a lot of very small numbers. For the sake of my calculation, I only need precision of ~1e-6 - 1e-9. Since this matrix is very large, I am trying to use a sparse matrix representation. So I try to do this:

np.seterr(all="raise")
...
F = getF()
F[np.abs(F) < EPSILON] = 0
# EPSILON = 1e-9. It is supposed to be in between 1e-6 and 1e-9
return csr_matrix(F)

But the computing the absolute value gives an underflow error (with numpy set to raise errors):

FloatingPointError: underflow encountered in absolute

Numpy doesn't raise an error if the seterr is not done, but just puts out NaNs which leads to problems as this matrix F is the starting point of a series of calculations.

From what I have read, the underflow is mainly handled by taking log and using the log values directly instead of the main ones, but in this case, I want to discard them all anyway. Is there a sane way of doing so? I thought of np.clip but I have complex number data so using it is not very straightforward.

So my question is whether there exists an elegant (and hopefully canonical) way of handling this?


Solution

  • First of, I can reproduce your error. As dawg pointed our this doesn't happen if you take float instead of complex. This is also the reason why option B works as the real and imag part are both arrays of floats. Another option (C) would be to use more bits to represent your data, I guess complex128 is the default for numpy.

    import numpy as np
    np.seterr(all="raise")
    eps = 1e-9
    
    def get_F(n=100):
        # generate some random data with some really small values
        r, i = np.random.random((2, n, n))**50
        F = r + 1j*i
        return F.astype('complex64')
    
    
    # A: fails
    F = get_F()
    F[np.abs(F) < eps] = 0
    
    # B: clip real and imag separatly
    F = get_F()
    F.real[np.abs(F.real) < eps] = 0
    F.imag[np.abs(F.imag) < eps] = 0
    
    # C: use more bits to represent your data
    F = get_F()
    F = F.astype('complex128')
    F[np.abs(F) < eps] = 0
    
    print('nonzeros in F:', np.count_nonzero(F))