How to ignore values when using numpy.sum and numpy.mean in matrices


Is there a way to exclude specific values when applying sum and mean in NumPy?

For instance, I'd like to leave the value -999 out of the calculation.

In [14]: c = np.matrix([[4., 2.],[4., 1.]])

In [15]: d = np.matrix([[3., 2.],[4., -999.]])

In [16]: np.sum([c, d], axis=0)
Out[16]:
array([[   7.,    4.],
       [   8., -998.]])

In [17]: np.mean([c, d], axis=0)
Out[17]:
array([[   3.5,    2. ],
       [   4. , -499. ]])

Solution

  • Use a masked array. Entries equal to -999 are masked, and sum and mean skip masked entries:

    >>> import numpy as np
    >>> c = np.ma.array([[4., 2.], [4., 1.]])
    >>> d = np.ma.masked_values([[3., 2.], [4., -999]], -999)
    
    >>> np.ma.array([c, d]).sum(axis=0)
    masked_array(data =
     [[7.0 4.0]
     [8.0 1.0]],
                 mask =
     [[False False]
     [False False]],
           fill_value = 1e+20)
    
    >>> np.ma.array([c, d]).mean(axis=0)
    masked_array(data =
     [[3.5 2.0]
     [4.0 1.0]],
                 mask =
     [[False False]
     [False False]],
           fill_value = 1e+20)
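
As a self-contained variant of the masked-array approach above, the sketch below stacks the two inputs first and masks the -999 entries in a single call. The names `SENTINEL`, `stacked`, `total`, and `average` are illustrative and not from the original post. It also shows one possible alternative, replacing the value with NaN and using `np.nansum` / `np.nanmean`; that is a different technique from the answer's and is included only for comparison.

```python
import numpy as np

SENTINEL = -999.0  # value to ignore, taken from the question (name is illustrative)

c = np.array([[4., 2.], [4., 1.]])
d = np.array([[3., 2.], [4., SENTINEL]])

# Stack into shape (2, 2, 2) and mask every entry equal to the sentinel.
raw = np.stack([c, d])
stacked = np.ma.masked_values(raw, SENTINEL)

# Reductions over axis 0 skip the masked entries.
total = stacked.sum(axis=0)
average = stacked.mean(axis=0)

# Convert back to plain ndarrays. A position masked in every input
# would stay masked in the result and is filled with NaN here.
total_plain = total.filled(np.nan)
average_plain = average.filled(np.nan)

# Alternative (not the answer's method): mark the sentinel as NaN
# and use the NaN-aware reductions.
with_nan = np.where(raw == SENTINEL, np.nan, raw)
nan_total = np.nansum(with_nan, axis=0)
nan_average = np.nanmean(with_nan, axis=0)

print(total_plain)
print(average_plain)
```

The two approaches agree on this data, but they can differ when a position is missing in every input: the masked result stays masked there, while `np.nansum` returns 0 and `np.nanmean` returns NaN with a warning.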