I have some array
a = np.array([1, 2, 3])
and some mask
mask = np.ones(a.shape, dtype=bool)
and can do
np.testing.assert_almost_equal(a[mask], a)  # passes: no exception is raised
However,
np.ma.array(a, mask=mask)
is equivalent to
a[np.logical_not(mask)]
and
np.ma.array(a, mask=np.logical_not(mask))
is equivalent to
a[mask]
This seems counterintuitive to me.
I would love an explanation of this design choice in NumPy.
In [6]: a = np.array([1,2,3])
In [7]: idx = np.array([1,0,1], bool)
In [8]: idx
Out[8]: array([ True, False, True])
In [9]: a[idx]
Out[9]: array([1, 3])
Just because you called a boolean array mask does not mean it behaves as a 'mask' in every sense of the word. I intentionally chose a different name. Yes, we do often call such an array mask and talk of 'masking', but what we are really doing is 'selecting'. The a[idx] operation returns the elements of a where idx is True. It's the same as indexing with the nonzero tuple:
In [13]: np.nonzero(idx)
Out[13]: (array([0, 2]),)
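As a quick check of that equivalence, here is a minimal sketch. It reuses a and idx from the session above; the names selected_by_bool and selected_by_pos are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
idx = np.array([True, False, True])

# Boolean indexing keeps the elements where idx is True ...
selected_by_bool = a[idx]

# ... which is the same as indexing with the integer positions
# that np.nonzero reports for the True entries.
selected_by_pos = a[np.nonzero(idx)]

print(selected_by_bool)  # [1 3]
print(selected_by_pos)   # [1 3]
```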
In np.ma, mask is used in the sense of 'mask out', or covering over.
In [10]: mm = np.ma.masked_array(a, mask=idx)
In [11]: mm
Out[11]:
masked_array(data=[--, 2, --],
             mask=[ True, False,  True],
       fill_value=999999)
In [12]: mm.compressed()
Out[12]: array([2])
In the display, the masked values show up as '--'. As the np.ma docs say, those elements are considered to be invalid and will be excluded from computations.
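To illustrate what 'excluded from computations' means, here is a small sketch. It rebuilds mm as above; the names total, count and average are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
idx = np.array([True, False, True])
mm = np.ma.masked_array(a, mask=idx)

# Reductions only see the unmasked element (the 2 in the middle).
total = mm.sum()
count = mm.count()    # number of unmasked elements
average = mm.mean()

print(total, count, average)  # 2 1 2.0
```

The underlying data is still there (mm.data is the original array); the mask only controls which elements take part.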
mm.filled() returns an array with the masked values replaced by the fill value:
In [16]: mm.filled()
Out[16]: array([999999, 2, 999999])
We can do the same thing with idx:
In [17]: a[idx] = 999999
In [18]: a
Out[18]: array([999999, 2, 999999])
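Putting the two conventions side by side, here is a minimal sketch starting from a fresh a. The names kept, hidden and kept_via_ma are only illustrative. If you want np.ma to keep the elements that a[idx] selects, invert the boolean array before passing it as mask.

```python
import numpy as np

a = np.array([1, 2, 3])
idx = np.array([True, False, True])

# Selection: keep the elements where idx is True.
kept = a[idx]

# Masking with the same array hides those elements instead,
# so only the element where idx is False survives.
hidden = np.ma.masked_array(a, mask=idx).compressed()

# Inverting the array makes np.ma keep what a[idx] keeps.
kept_via_ma = np.ma.masked_array(a, mask=~idx).compressed()

print(kept)         # [1 3]
print(hidden)       # [2]
print(kept_via_ma)  # [1 3]
```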