numpymatplotlibcartopyhdfsatellite-image

Python: Change color of bad pixels


I have an HDF file that contains satellite data and when plotted, looks like the attached image below. It has a lot of bad pixels (they're colored black when using this particular colormap). How can I change the color of bad pixels to white for example? Thank you.

enter image description here

Link to HDF file: https://drive.google.com/file/d/1AYhvqxNlvlqQB7otn_veLkO3t2qPogV_/view?usp=sharing

Here is a simplified code that produces the image:

import h5py
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

filename = 'NmHRIR1H.19660626_14-57-45_564_005.hdf'

with h5py.File(filename) as f:  
    temp = f['HRIR-Temp'][:]-273.15 # Convert K to C
    lat = f['latitude'][:]
    lon = f['longitude'][:]
    
    f.close()

fig = plt.figure(figsize=(8,8))
ax = plt.axes(projection=ccrs.Mercator())

extent = [124,139,12,27]
ax.set_extent(extent, crs=ccrs.PlateCarree())

ax.coastlines(resolution='10m', color='gold', linewidth=0.5)

ax.pcolormesh(lon, lat, temp, cmap='binary', vmin=-60, vmax=40, transform=ccrs.PlateCarree())

plt.show()

UPDATE: I checked the documentation for this dataset again, and it has a field called "Flags" that has bit flags stored in it. Bit 4 in particular seems to be the key for me to achieve what I want (when bit 4 is on, it means bad data). enter image description here enter image description here


Solution

  • The Flags never seem to get a value higher than 3, so I don't think that will help you out for this specific issue. I don't see any attributes specifying the nodata value, or the documentation mentioning it, but it's quite obvious that the maximum float32 value is used for this.

    The coordinates outside the swath are filled with this value, and also the extreme low (?) values in the temperature.

    As @Michael Delgado already mentioned, cmap.set_bad() is a good way to assign a specific color (or alpha!) to these values. But it requires Matplotlib to know what is "bad". In the current form, you could use cmap.set_over() but that will loose the distinction with valid values that are clipped (temp > vmax).

    It's usually best to parse a nodata value from the metadata of the file, but I don't see it right away (didn't look that careful). When hard coding it:

    import h5py
    import numpy as np
    import cartopy.crs as ccrs
    import matplotlib.pyplot as plt
    import matplotlib as mpl
    
    fn = 'NmHRIR1H.19660626_14-57-45_564_005.hdf'
    nodata_value = np.finfo(np.float32).max
    
    with h5py.File(fn) as f:
        temp = np.ma.masked_values(f['HRIR-Temp'][:], nodata_value) - 273.15
        lat = np.ma.masked_values(f['latitude'][:], nodata_value)
        lon = np.ma.masked_values(f['longitude'][:], nodata_value)
    
    cmap = mpl.colormaps["binary"].copy()
    cmap.set_bad("r", alpha=1)
    norm = mpl.colors.Normalize(vmin=-60, vmax=40)
    
    fig, ax = plt.subplots(
        figsize=(6,6), dpi=86, facecolor="w", layout="compressed", 
        subplot_kw=dict(projection=ccrs.PlateCarree()),
    )
    
    im = ax.pcolormesh(lon, lat, temp, cmap=cmap, norm=norm)
    cb = fig.colorbar(im, ax=ax, shrink=.5, extend="both")
    
    ax.coastlines(resolution='10m', color='gold', linewidth=0.5)
    
    extent = [124,139,12,27]
    ax.set_extent(extent, crs=ccrs.PlateCarree())
    

    enter image description here