pythonpython-xarraynetcdfcartopy

Plotting Himawari-8 NetCDF data - Axes don't match array shape


This is quite related to this post last year for reference. I'm trying to visualize Himawari-8 AHI centered around Philippines as well and downloaded through NCI Thredds Server. Unfortunately, I ran into some problems, particularly in making a TrueColor image (similar to this one) i.e. stacking and plotting it via Cartopy.

Upon reading with xarray, the Bands 1, 2, and 4 (veggie) have a similar resolution of 1000 m and dimensions, however Band 3 has a resolution of 500 m instead. Given that they have different resolutions, I initially sliced each of the NetCDF data's x and y coords before merging them with xr.merge.

I tried to stack each band with np.stack but unfortunately, it gives off an error;

Axes don't match array shape. Got (4, 1), expected (2999, 2999).

At the moment, the code runs like this:

import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from pathlib import Path
import numpy as np

# I/O
data_dir = Path("New folder (3)")

ds1 = xr.open_dataset(data_dir/ '20160814080000-P1S-ABOM_OBS_B01-PRJ_GEOS141_1000-HIMAWARI8-AHI.nc')
ds2 = xr.open_dataset(data_dir/ '20160814080000-P1S-ABOM_OBS_B02-PRJ_GEOS141_1000-HIMAWARI8-AHI.nc')
ds3 = xr.open_dataset(data_dir/ '20160814080000-P1S-ABOM_OBS_B03-PRJ_GEOS141_500-HIMAWARI8-AHI.nc')
ds4 = xr.open_dataset(data_dir/ '20160814080000-P1S-ABOM_OBS_B04-PRJ_GEOS141_1000-HIMAWARI8-AHI.nc')

# x and y coordinates are in meters already. 
# We select a portion of the FDK focused around Luzon i.e. Central to Northern Luzon

dx1 = ds1.isel(x=slice(3000, 4000), y=slice(3000, 4000))
#dx1

dx2 = ds2.isel(x=slice(3000, 4000), y=slice(3000, 4000))
#dx2

# Has a different resolution, matched up to the 1000 m of other bands
dx3 = ds3.isel(x=slice(6000, 8000), y=slice(6000, 8000))
#dx3

dx4 = ds4.isel(x=slice(3000, 4000), y=slice(3000, 4000))
#dx4

dx = xr.merge([dx1, dx2, dx3, dx4])

central_longitude = dx['geostationary'].longitude_of_projection_origin
satellite_height = dx['geostationary'].satellite_height

# convert km to m
mapx = dx['x'].to_numpy()
mapy = dx['y'].to_numpy()

rgb = np.stack((
    dx['channel_0001_scaled_radiance'].to_numpy(), # Band 1 is blue (0.47 um)
    dx['channel_0002_scaled_radiance'].to_numpy(), # Band 2 is green (0.51 um)
    dx['channel_0003_scaled_radiance'].to_numpy(), # Band 3 is red (0.64 um)
    dx['channel_0004_scaled_radiance'].to_numpy(), # Band 4 is "veggie" infrared (0.865 um)
))

# stretch RGB values a little
rgb_stretched = np.clip((rgb/0.85)**0.85, 0, 1)

# define projections
data_proj = ccrs.Geostationary(
    central_longitude=central_longitude, 
    satellite_height=satellite_height,
)
map_proj =  ccrs.Miller(central_longitude=central_longitude)

# plotting
fig, ax = plt.subplots(
    figsize=(20, 12), facecolor="w", dpi=300,
    subplot_kw=dict(projection=map_proj),
)

datacrs = ccrs.PlateCarree()

pcm = ax.pcolorfast(mapx, mapy, rgb_stretched, transform=data_proj)

ax.coastlines(color="w")

I believe I missed out or did something incorrect on the process due to my novice python skills. Is there a way to fix this out so that the Himawari-8 sat image can be displayed? Attached here is the sample data used on the code.


Solution

  • There are several issues with this. But let me start out by mentioning that there are dedicated libraries available for this, like SatPy (https://satpy.readthedocs.io/en/stable/index.html#), those might be more convenient to some.

    Issue 1) The merging with Xarray is performed on the (data) coordinates, which are different different (shifted) for the 500m band. This introduces a lot of NaN values "in between".

    Solution 1) You could select every other pixel from the high-res band, assuming both grids are a aligned (eg .slice(6000, 8000, 2)). But a more robust method would be to interpolate/resample all data to a common grid. Easiest would be to resample band 3 to the other ones, but you can do it the other way around to retain the 500m resolution of band 3. So:

    dx = xr.merge([dx1, dx2, dx3.interp_like(dx1), dx4])
    

    Issue 2) Matplotlib expects an RGB format with the color information on the last dimension (x, y, color). Your example also has the time dimension still present, you have (color, time, x, y).

    Solution 2) Since you only have one timestamp, you can just remove it by adding isel(time=0) to your initial slice. That drops the time dimensions. For example for the first band:

    dx1 = ds1.isel(time=0, x=slice(3000, 4000, 1), y=slice(3000, 4000, 1))
    

    And specifying the last dimension when stacking the Numpy arrays adds the color information to the last dimension (instead of the default first), so:

    rgb = np.stack((...), axis=-1)
    

    I would also remove the 4th NIR band, otherwise that would be interpreted as the alpha (transparency), I suspect that's not what you intended, but feel free to add it back, or select a different combination of bands for the final image.

    Issue 3) Matplotlib's pcolorfast (or pcolormesh etc) expects coordinate arrays to specify the edges, so it needs an extra coordinate element compared to the data (N+1, M+1). This might be a recent change in Matplotlib (?).

    Solution 3) Since you're dealing with a regular grid, a faster solution to begin with would be to just provide the outer edges/extent of the grid. This can be done by selecting the outer coordinates defining the center of that pixel, and adding/subtracting half the resolution (500m in the example below).

    mapx = (mapx[0]-500, mapx[-1]+500)
    mapy = (mapy[0]+500, mapy[-1]-500)
    

    This then gives:
    enter image description here

    The entire snippet, for completeness. It's probably possible to rewrite the Xarray part a little cleaner, but this stays close to the original code from the OP.

    import xarray as xr
    import matplotlib.pyplot as plt
    import cartopy.crs as ccrs
    from pathlib import Path
    import numpy as np
    
    data_dir = Path("D:\Temp\him8")
    
    ds1 = xr.open_dataset(data_dir / '20160814080000-P1S-ABOM_OBS_B01-PRJ_GEOS141_1000-HIMAWARI8-AHI.nc')
    ds2 = xr.open_dataset(data_dir / '20160814080000-P1S-ABOM_OBS_B02-PRJ_GEOS141_1000-HIMAWARI8-AHI.nc')
    ds3 = xr.open_dataset(data_dir / '20160814080000-P1S-ABOM_OBS_B03-PRJ_GEOS141_500-HIMAWARI8-AHI.nc')
    
    dx1 = ds1.isel(time=0, x=slice(3000, 4000, 1), y=slice(3000, 4000, 1))
    dx2 = ds2.isel(time=0, x=slice(3000, 4000, 1), y=slice(3000, 4000, 1))
    dx3 = ds3.isel(time=0, x=slice(6000, 8000, 1), y=slice(6000, 8000, 1))
    
    dx = xr.merge([dx1, dx2, dx3.interp_like(dx1)])
    
    central_longitude = dx1['geostationary'].longitude_of_projection_origin
    satellite_height = dx1['geostationary'].satellite_height
    
    mapx = dx1['x'].to_numpy()
    mapy = dx1['y'].to_numpy()
    
    mapx = (mapx[0]-500, mapx[-1]+500)
    mapy = (mapy[0]+500, mapy[-1]-500)
    
    rgb = np.stack((
        dx['channel_0003_scaled_radiance'].to_numpy(), # Band 3 is red (0.64 um)
        dx['channel_0002_scaled_radiance'].to_numpy(), # Band 2 is green (0.51 um
        dx['channel_0001_scaled_radiance'].to_numpy(), # Band 1 is blue (0.47 um)
    ), axis=-1)
    
    rgb_stretched = np.clip((rgb/0.85)**0.85, 0, 1)
    
    data_proj = ccrs.Geostationary(
        central_longitude=central_longitude, 
        satellite_height=satellite_height,
    )
    map_proj =  ccrs.Miller(central_longitude=central_longitude)
    
    fig, ax = plt.subplots(
        figsize=(10, 10), facecolor="w", dpi=100,
        subplot_kw=dict(projection=map_proj),
    )
    
    pcm = ax.pcolorfast(mapx, mapy, rgb_stretched, transform=data_proj)
    ax.coastlines(color="w")
    ax.set_extent((*mapx, *mapy), crs=data_proj)