Assigning values to a NumPy array based on multiple conditions on multiple arrays

I have ocean and atmospheric datasets in a netCDF file. Over land, the ocean data contain NaN or some other fill value such as -999. For this example, say it is NaN. Sample data look like this:

import numpy as np
ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])
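
If the land cells hold a fill value such as -999 instead of NaN, they can be converted to NaN first so that the rest of the example applies unchanged. The following is only a minimal sketch: the names FILL_VALUE, ocean_raw and ocean_from_fill are made up for illustration, and the actual fill value depends on the file.

```python
import numpy as np

# Hypothetical raw data in which land cells hold the fill value -999.
FILL_VALUE = -999.0
ocean_raw = np.array([[2, 4, 5], [6, FILL_VALUE, 2], [9, 3, FILL_VALUE]])

# Replace the fill value with NaN so land cells can be found with np.isnan.
ocean_from_fill = np.where(ocean_raw == FILL_VALUE, np.nan, ocean_raw)
print(ocean_from_fill)
```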

Now I want to apply multiple conditions to the ocean and atmos data to make a new array that contains only values from 1 to 8. For example, in the ocean data, values between 2 and 4 are assigned 1 and values between 4 and 6 are assigned 2. The same kind of comparison applies to the atmos dataset.

To simplify the comparison and assignment, I made a list of bin edges for each dataset and used np.digitize to assign the categories.

bin1 = [2, 4, 6]
bin2 = [4, 6, 8]
ocean_cat = np.digitize(ocean, bin1)
atmos_cat = np.digitize(atmos, bin2)

which produces the following results (ocean_cat first, then atmos_cat):

[[1 2 2]
 [3 3 1]
 [3 1 3]]

[[1 0 1]
 [2 2 0]
 [3 0 0]]
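
Note that the NaN cells of ocean come out as category 3 in the first array. A minimal sketch of why, assuming the default right=False behaviour of np.digitize (the values array is made up for illustration): NaN is treated as lying beyond the last bin edge, and the result has an integer dtype, so it cannot hold NaN at all.

```python
import numpy as np

bin1 = [2, 4, 6]
# Illustrative values: below the first edge, on an edge, inside a bin, and NaN.
values = np.array([1.0, 2.0, 5.0, 6.0, np.nan])

# NaN is placed after the last bin edge, so it gets index len(bin1).
cats = np.digitize(values, bin1)
print(cats)        # [0 1 2 3 3]
print(cats.dtype)  # an integer dtype, which cannot store NaN
```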

Now I want the element-wise maximum of the two arrays above, so I used np.fmax.

final_cat = np.fmax(ocean_cat, atmos_cat)
print(final_cat)

which produces the result below:

[[1 2 2]
 [3 3 1]
 [3 1 3]]

The result above is almost what I need. The only problem is that the NaN values are lost. What I want in the final result is:

[[1 2 2]
 [3 nan 1]
 [3 1 nan]]

Can someone help me put NaN back at the positions where the original ocean array is NaN?

Solution

np.digitize returns integer bin indices, so the NaNs are lost at that step. A simple option is to mask the output with numpy.where:

bin1 = [2, 4, 6]
bin2 = [4, 6, 8]
ocean_cat = np.digitize(ocean, bin1)
atmos_cat = np.digitize(atmos, bin2)
# Put NaN back wherever ocean is NaN; elsewhere keep the element-wise maximum
final_cat = np.where(np.isnan(ocean), np.nan,
                     np.fmax(ocean_cat, atmos_cat))

If both arrays can have NaNs:

final_cat = np.where(np.isnan(ocean) | np.isnan(atmos),
                     np.nan,
                     np.fmax(ocean_cat, atmos_cat))

Or use np.isnan(ocean) & np.isnan(atmos) as the condition if you only want a NaN when both inputs are NaN.

Output for the sample data:

array([[ 1.,  2.,  2.],
       [ 3., nan,  1.],
       [ 3.,  1., nan]])
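
To illustrate the difference between the | and & conditions, here is a small sketch. The atmos_nan array is hypothetical (the sample atmos data has no NaNs) and is used only to show the two masks side by side.

```python
import numpy as np

ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
# Hypothetical atmos array with NaNs added for illustration only.
atmos_nan = np.array([[4, np.nan, 5], [6, np.nan, 3], [8, 3, 2]])

# Element-wise maximum of the two category arrays (integers, no NaN yet).
cats = np.fmax(np.digitize(ocean, [2, 4, 6]),
               np.digitize(atmos_nan, [4, 6, 8]))

# NaN wherever at least one input is NaN.
either = np.where(np.isnan(ocean) | np.isnan(atmos_nan), np.nan, cats)
# NaN only where both inputs are NaN.
both = np.where(np.isnan(ocean) & np.isnan(atmos_nan), np.nan, cats)

print(either)
print(both)
```

One thing to watch with the & version: a cell that is NaN in only one input is not masked, and np.digitize has already mapped that NaN to the highest category (3 here), so that cell ends up as 3 in the maximum rather than taking the other array's category.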

Generic approach for any number of input arrays:

arrays = [ocean, atmos]
bins = [bin1, bin2]

# NaN wherever any input is NaN; otherwise the maximum category across inputs
out = np.where(np.logical_or.reduce([np.isnan(a) for a in arrays]),
               np.nan,
               np.fmax.reduce([np.digitize(a, b) for a, b in zip(arrays, bins)]))
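
The generic approach can also be wrapped in a small function and checked against the sample data from the question. This is just a sketch: the name categorize_with_nan is made up here, and it assumes all input arrays have the same shape.

```python
import numpy as np

def categorize_with_nan(arrays, bins):
    """Hypothetical helper: maximum np.digitize category across inputs,
    with NaN wherever any input array is NaN."""
    # True where at least one of the inputs is NaN.
    nan_mask = np.logical_or.reduce([np.isnan(a) for a in arrays])
    # Element-wise maximum of the per-array categories.
    cats = np.fmax.reduce([np.digitize(a, b) for a, b in zip(arrays, bins)])
    # The NaN branch makes the result a float array.
    return np.where(nan_mask, np.nan, cats)

ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])

out = categorize_with_nan([ocean, atmos], [[2, 4, 6], [4, 6, 8]])
print(out)
```

The result is a float array, because an integer array cannot hold NaN.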