pythonpython-xarraydatashader

How to troubleshoot xarray.open_rasterio parsing?


I want to rasterize smaller shades of high-resolution GeoTIFFs. xarray.open_rasterio seemed the right tool to get what datashader.transfer_functions.shade would need. However, the returned DataArray also has a band, which trips up shade. Several questions arise:

  1. Should xarray.open_rasterio return the values currently in "band" simply as the values in the array?
  2. How can one check whether the GeoTIFF looks like what xarray.open_rasterio expects?
  3. Would any argument of xarray.open_rasterio allow the specification of "band" as the "value"?
  4. Or should xarray.open_rasterio simply reorder or relabel the coordinates in a way such that "band" is the third one (after "x" and "y")?
  5. Or if xarray.open_rasterio parsed this GeoTIFF correctly, could one call shade in a way it does not confuse this 2D array with a 3D array?

MRE: Use a GeoTIFF from Facebook's high-resolution population maps, for instance, from here. The code below could put this on a 800x800 map. Instead, after I finally understood why shade complained it had only (the default) 22 colors in its color_key argument while it was trying to color 800 categories, I understood the y coordinate is understood by shade to be the value. I show the array below.

import rasterio
from rasterio.mask import mask
import os
import datashader as ds
from datashader import transfer_functions as tf
import xarray as xr
from matplotlib.cm import viridis

data_path = 'SOME_PATH/'
file_name = 'HUN_women_of_reproductive_age_15_49.tif'  # reproductive women, e.g.
file_path = os.path.join(data_path, file_name)

da = xr.open_rasterio(file_path)

cvs = ds.Canvas(plot_width=800, plot_height=800)

img = tf.shade(cvs.raster(da), cmap=viridis)

This fails because the da array looks like this:

<xarray.DataArray (band: 1, y: 10240, x: 24320)> array([[[nan, nan, ..., nan, nan],
        [nan, nan, ..., nan, nan],
        ...,
        [nan, nan, ..., nan, nan],
        [nan, nan, ..., nan, nan]]]) Coordinates:   * band     (band) int64 1   * y        (y) float64 48.59 48.59 48.59 48.59 ... 45.75
45.75 45.75 45.75   * x        (x) float64 16.13 16.14 16.14 16.14 ... 22.89 22.89 22.89 22.89 Attributes:
    transform:      (0.000277777777778, 0.0, 16.13486111111111, 0.0, -0.00027...
    crs:            +init=epsg:4326
    res:            (0.000277777777778, 0.000277777777778)
    is_tiled:       1
    nodatavals:     (nan,)
    scales:         (1.0,)
    offsets:        (0.0,)
    descriptions:   ('Population Count',)
    AREA_OR_POINT:  Area

Solution

  • cvs.raster() accepts a layer argument to specify which of the provided bands you want to rasterize; maybe that will help?

    img = tf.shade(cvs.raster(da,layer=1), cmap=viridis)
    

    In any case, note that datashader.transfer_functions.shade does not rasterize its input; that's done by a call to Canvas (specifically cvs.raster, here). shade just converts an already rasterized array into colored pixels.