I am having issues with the function that assigns y and x coordinates values for grid mapping. Essentially my end goal is to be able to extract grid values from a given set of lat/lon pairs, which seems to be way more complicated than it should be (and maybe my method isn't the best). I am using a standard RTMA grib file and opening with xarray and the cfgrib engine. The RTMA data does not have grid mapping attached to the data, so the CRS needs to be manually added. I have attempted to supply that by getting information directly from the grib file and confirming using pygrib. The issue is that it seems the x/y coordinate values don't align correctly with their respective lat/lon coordinates, so when trying to index for point extraction, you can be a few grid cells off (which can have big differences with precip gradients). The following is the work I've done so far:
rtma = xr.open_dataset("/rtma2p5.2022022023z.conus.grb2", engine="cfgrib")
print(rtma)
<xarray.Dataset>
Dimensions: (y: 1377, x: 2145)
Coordinates:
time datetime64[ns] ...
step timedelta64[ns] ...
heightAboveGround float64 ...
latitude (y, x) float64 ...
longitude (y, x) float64 ...
valid_time datetime64[ns] ...
surface float64 ...
Dimensions without coordinates: y, x
Data variables:
t2m (y, x) float32 ...
d2m (y, x) float32 ...
sp (y, x) float32 ...
vis (y, x) float32 ...
Attributes:
GRIB_edition: 2
GRIB_centre: kwbc
GRIB_centreDescription: US National Weather Service - NCEP
GRIB_subCentre: 4
Conventions: CF-1.7
institution: US National Weather Service - NCEP
history: 2022-04-19T15:21 GRIB to CDM+CF via cfgrib-0.9.1...
We have latitude and longitude coordinates, buy no x/y coordinate values, even after attempting to use metpy.parse_cf(). So, time to add the grid mapping manually.
Using pygrib we can confirm some of the CRS values
grib = pygrib.open("/rtma2p5.2022022023z.conus.grb2")
msg = grib.message(1)
CRS(msg.projparams).to_cf()
{'crs_wkt': 'PROJCRS["unknown",BASEGEOGCRS["unknown",DATUM["unknown",ELLIPSOID["unknown",6371200,0,LENGTHUNIT["metre",1,ID["EPSG",9001]]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8901]]],CONVERSION["unknown",METHOD["Lambert Conic Conformal (2SP)",ID["EPSG",9802]],PARAMETER["Latitude of false origin",25,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8821]],PARAMETER["Longitude of false origin",265,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8822]],PARAMETER["Latitude of 1st standard parallel",25,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8823]],PARAMETER["Latitude of 2nd standard parallel",25,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8824]],PARAMETER["Easting at false origin",0,LENGTHUNIT["metre",1],ID["EPSG",8826]],PARAMETER["Northing at false origin",0,LENGTHUNIT["metre",1],ID["EPSG",8827]]],CS[Cartesian,2],AXIS["(E)",east,ORDER[1],LENGTHUNIT["metre",1,ID["EPSG",9001]]],AXIS["(N)",north,ORDER[2],LENGTHUNIT["metre",1,ID["EPSG",9001]]]]',
'semi_major_axis': 6371200.0,
'semi_minor_axis': 6371200.0,
'inverse_flattening': 0.0,
'reference_ellipsoid_name': 'unknown',
'longitude_of_prime_meridian': 0.0,
'prime_meridian_name': 'Greenwich',
'geographic_crs_name': 'unknown',
'horizontal_datum_name': 'unknown',
'projected_crs_name': 'unknown',
'grid_mapping_name': 'lambert_conformal_conic',
'standard_parallel': (25.0, 25.0),
'latitude_of_projection_origin': 25.0,
'longitude_of_central_meridian': 265.0,
'false_easting': 0.0,
'false_northing': 0.0}
Great, now we'll assign the grid mapping to get x/y coordinate values. We'll use the attribute data from the grib file to add this since we confirmed it looks correct using pygrib
# Assign the grid mapping
grid = rt.metpy.assign_crs({
"semi_major_axis": 6371200.0,
"semi_minor_axis": 6371200.0,
"grid_mapping_name": "lambert_conformal_conic",
"standard_parallel": [
rt["t2m"].attrs["GRIB_Latin1InDegrees"],
rt["t2m"].attrs["GRIB_Latin2InDegrees"]
],
"latitude_of_projection_origin": rt["t2m"].attrs["GRIB_LaDInDegrees"],
"longitude_of_central_meridian": rt["t2m"].attrs["GRIB_LoVInDegrees"],
}).metpy.assign_y_x()
print(grid)
<xarray.Dataset>
Dimensions: (y: 1377, x: 2145)
Coordinates:
time datetime64[ns] ...
step timedelta64[ns] ...
heightAboveGround float64 ...
latitude (y, x) float64 20.19 20.2 20.2 ... 50.12 50.11 50.11
longitude (y, x) float64 238.4 238.5 238.5 ... 299.1 299.1 299.1
valid_time datetime64[ns] ...
surface float64 ...
metpy_crs object Projection: lambert_conformal_conic
* y (y) float64 -2.638e+05 -2.612e+05 ... 3.228e+06 3.231e+06
* x (x) float64 -2.763e+06 -2.761e+06 ... 2.679e+06 2.682e+06
Data variables:
t2m (y, x) float32 ...
d2m (y, x) float32 ...
sp (y, x) float32 ...
vis (y, x) float32 ...
Attributes:
GRIB_edition: 2
GRIB_centre: kwbc
GRIB_centreDescription: US National Weather Service - NCEP
GRIB_subCentre: 4
Conventions: CF-1.7
institution: US National Weather Service - NCEP
history: 2022-04-19T15:21 GRIB to CDM+CF via cfgrib-0.9.1...
The issue now comes when trying to extract values from the lat/lon pairs. I can convert an example lat/lon pair to coordinates in the grid projection:
grid_crs = ccrs.LambertConformal(central_longitude = grid["t2m"].attrs["GRIB_LoVInDegrees"],
central_latitude = grid["t2m"].attrs["GRIB_LaDInDegrees"],
standard_parallels = [
grid["t2m"].attrs["GRIB_Latin1InDegrees"],
grid["t2m"].attrs["GRIB_Latin2InDegrees"]
]
)
lat = 35.36234
lon = 283.28369
x_t, y_t = grid_crs.transform_point(lon, lat, src_crs=ccrs.PlateCarree())
and then use either
grid.sel(x=x_t, y=y_t, method='nearest')
or
grid.interp(x=x_t, y=y_t, method='nearest')
however I was noticing the values were a little off. After ton of trial and error and plotting the values where either the interpolation or nearest neighbor was done, I noticed the points were off. So I went back and plotted the edge value (first x/y coordinate value pair in the grib file) and noticed it wasn't aligned quite right.
print(grid.coords["x"][0].values)
-2763204.4992319937
print(grid.coords["y"][0].values)
-263789.4687076026
After plotting the pcolormesh of the grid as well as the lat/lon values in the grid, you can see that the first x/y coordinate point (red) does not match up with the bottom left (blue) lat/lon pair. This causes erroneous values all across the grid when trying to index or interpolate for point values.
Finally, this image has a lot going on, but hopefully it helps you understand. Plotted here is the pcolormesh of the t2m field for example. Light blue dots are the lat/lon coordinates from the file. The black ring is the requested location of where I'm trying to extract data from. The purple dots are a 3x3 grid of lat/lon coordinates surrounding the station (I found the nearest index to the station using a method described here: https://stackoverflow.com/a/58774123/13324961). The orange dots are the same 3x3 grid, but plotted using the x/y coordinate value pairs. This shows you that the x/y coordinate value pairs, when plotted, do not correspond to the same latitude and longitude. When this happens, you run into issues indexing points. When you run .sel with the nearest method, you'll see it grabs the nearest x/y coordinate pair (bright green dot overlapping the orange dot) which corresponds to a lat/lon pair much further away (solid bright green dot). Same can be see when trying to interpolate; it can correctly locate the station (pink dot overlapping the black dot), but its corresponding lat/lon value again is further away and giving you a different value.
Hopefully that makes sense, its somewhat difficult to explain, but hopefully that last image helped. Is there any way to remedy this? Or perhaps a better method to extract point values?
The most likely cause of this shift is due to differing "globes" or "ellipsoids" between the CRS used in the dataset (for given latitude/longitudes and calculated y/x) and that used to calculate projection y/x pairs from a desired latitude/longitude pair. In particular, note that the dataset's CRS was constructed with the following:
"semi_major_axis": 6371200.0,
"semi_minor_axis": 6371200.0,
but no ellipsoid arguments were provided when constructing grid_crs
using ccrs.LambertConformal()
, which means that the default ellipsoid (wgs84) was used instead.
Luckily, MetPy has utilities for handling the CRS in the dataset that should make it easier to keep the CRS and globe consistent. Instead of constructing the grid CRS yourself, and leaving the lat/lon CRS unspecified, I would recommend using MetPy's accessor:
grid_crs = grid["t2m"].metpy.cartopy_crs
latlon_crs = ccrs.PlateCarree(globe=grid["t2m"].metpy.cartopy_globe)
x_t, y_t = grid_crs.transform_point(lon, lat, src_crs=latlon_crs)