I have a 2D array with zero values in each row:
[[5, 3, 2, 0, 0, 1, 6, 9, 11, 1, 4, 1],
[0, 0, 12, 0, 1, 0, 0, 2, 0, 30, 2, 2],
[120, 2, 10, 3, 0, 0, 2, 7, 9, 5, 0, 0]]
Is there a way to calculate the 0.75 quantile of each row while excluding the zero values from the calculation? For example, in the second row, only the 6 non-zero values [12, 1, 2, 30, 2, 2] should be used. I tried np.quantile(), but it includes all the zero values in the calculation. NumPy also doesn't seem to have a masked-array (np.ma) version of quantile().
You can replace the zero values with nan and pass the array to np.nanquantile(), which computes the quantile of the non-nan values:
>>> import numpy as np
>>> arr = np.array([[5, 3, 2, 0, 0, 1, 6, 9, 11, 1, 4, 1],
...                 [0, 0, 12, 0, 1, 0, 0, 2, 0, 30, 2, 2],
...                 [120, 2, 10, 3, 0, 0, 2, 7, 9, 5, 0, 0]], dtype='f')
>>> arr[arr == 0] = np.nan  # mark zeros as missing (needs a float dtype)
>>> print(arr)
[[ 5. 3. 2. nan nan 1. 6. 9. 11. 1. 4. 1.]
[ nan nan 12. nan 1. nan nan 2. nan 30. 2. 2.]
[120. 2. 10. 3. nan nan 2. 7. 9. 5. nan nan]]
>>> arr_quantile75 = np.nanquantile(arr, 0.75, axis=1)  # axis=1: one quantile per row
>>> print(arr_quantile75)
[5.75 9.5 9.25]
np.nanquantile() computes the q-th quantile of the data along the specified axis, while ignoring nan values.
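
If you would rather not overwrite the original array, or it has an integer dtype (which cannot hold nan), here is a minimal sketch of the same idea using np.where to build a float copy. The helper name nonzero_quantile is made up for this illustration and is not a NumPy function; the per-row loop at the end is only a cross-check against plain np.quantile().

```python
import numpy as np

def nonzero_quantile(a, q, axis=1):
    """Quantile along `axis`, ignoring zeros. Hypothetical helper; leaves `a` untouched."""
    a = np.asarray(a)
    # np.where returns a new float array: nan where a == 0, the original value elsewhere
    masked = np.where(a == 0, np.nan, a)
    return np.nanquantile(masked, q, axis=axis)

# Same data as in the question, kept as integers on purpose
arr = np.array([[5, 3, 2, 0, 0, 1, 6, 9, 11, 1, 4, 1],
                [0, 0, 12, 0, 1, 0, 0, 2, 0, 30, 2, 2],
                [120, 2, 10, 3, 0, 0, 2, 7, 9, 5, 0, 0]])

result = nonzero_quantile(arr, 0.75)

# Cross-check: drop the zeros from each row explicitly, then use plain np.quantile
check = np.array([np.quantile(row[row != 0], 0.75) for row in arr])

print(result)
print(check)
```

Both approaches should give 5.75, 9.5 and 9.25 for the three rows. Note that a row consisting only of zeros becomes all-nan, and np.nanquantile() then returns nan for that row along with a RuntimeWarning.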