python json plotly geojson geopandas

Plotly hexbin cutoff within specified json boundary


I'm plotting a hexbin figure together with a separate JSON boundary file. The hexbin grid extends beyond the boundary, though. I'm only interested in displaying the African continent, so I'd like to cut off or subset the hexbin grid so that no hexagon is drawn outside the boundary. Is there a way to achieve this using Plotly?

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objs as go
import plotly.figure_factory as ff
import geopandas as gpd
import json

data = pd.DataFrame({
    'LAT': [1,5,6,7,5,6,7,5,6,7,5,6,7,12,-40,50],
    'LON': [10,10,11,12,10,11,12,10,11,12,10,11,12,-20,40,50],
    })

gdf_poly = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
gdf_poly = gdf_poly.drop('name', axis = 1)

Afr_gdf_area = gdf_poly[gdf_poly['continent'] == 'Africa'].reset_index(drop = True)

fig = ff.create_hexbin_mapbox(data_frame=data,
                       lat="LAT", 
                       lon="LON",
                       nx_hexagon=25,
                       opacity=0.4,
                       labels={"color": "Point Count"},
                       mapbox_style='carto-positron',
                       zoom = 1
                       )

fig.update_layout(mapbox={
        "layers": [
            {"source": json.loads(Afr_gdf_area.geometry.to_json()),
                "below": "traces",
                "type": "fill",
                "color": "orange",
                "opacity" : 0.1,
                "line": {"width": 1}
            },
        ],
    })   

fig.show()

The intended output is to cut off or clip the hexagons that fall outside the African continent, which is shown in orange.



Solution

  • If you look inside fig.data[0], it's a Choroplethmapbox trace with several fields, including customdata and geojson. The geojson contains all of the information that plotly needs to draw the hexbins, including the coordinates and a unique id for each hexagon. The customdata is an array of shape [n_hexbins x 3] in which each row holds the id of a hexagon together with the numeric values that plotly uses to determine its color.

    'customdata': array([[0.0, '-0.3490658516205964,-0.7648749219440846', 0],
                             [0.0, '-0.3490658516205964,-0.6802309514438665', 0],
                             [0.0, '-0.3490658516205964,-0.5955869809436484', 0],
                             ...,
                             [0.0, '0.8482300176421051,0.8010385323099501', 0],
                             [0.0, '0.8482300176421051,0.8856825028101681', 0],
                             [0.0, '0.8482300176421051,0.9703264733103861', 0]], dtype=object),
        'geojson': {'features': [{'geometry': {'coordinates': [[[-20.00000007,
                                                               -41.31174966478728],
                                                               [-18.6000000672,
                                                               -40.70179509236059],
                                                               [-18.6000000672,
                                                               -39.464994178287064],
                                                               [-20.00000007,
                                                               -38.838189880150665],
                                                               [-21.4000000728,
                                                               -39.464994178287064],
                                                               [-21.4000000728,
                                                               -40.70179509236059],
                                                               [-20.00000007,
                                                               -41.31174966478728]]],
                                               'type': 'Polygon'},
                                  'id': '-0.3490658516205964,-0.7648749219440846',
                                  'type': 'Feature'},
                                 {'geometry': {'coordinates': [[[-20.00000007,
                                                               -37.56790013078226],
                                                               [-18.6000000672,
                                                               -36.924474103794715],
                                                               [-18.6000000672,
                                                               -35.62123099996148],
                                                               [-20.00000007,
                                                               -34.96149172026768],
                                                               [-21.4000000728,
                                                               -35.62123099996148],
                                                               [-21.4000000728,
                                                               -36.924474103794715],
                                                               [-20.00000007,
                                                               -37.56790013078226]]],
                                               'type': 'Polygon'},
                                  'id': '-0.3490658516205964,-0.6802309514438665',
                                  'type': 'Feature'},
                                 {'geometry': {'coordinates
    ...
    

    To select the hexbins within the specified boundary, we can start by extracting the information from customdata and geojson within the fig.data[0] generated by plotly, and use it to create a GeoDataFrame. Then we can create a new GeoDataFrame called hexbins_in_afr, which is a spatial inner join (sjoin) between our new gdf of hexbins and Afr_gdf_area, so that every hexbin that does not touch Afr_gdf_area is dropped. By default sjoin matches on intersection, so hexbins that straddle the boundary are kept whole rather than clipped.
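
To make the selection rule concrete, here is a dependency-free sketch. It is not the geopandas call used in this answer: it swaps the spatial join for a plain ray-casting point-in-polygon test and keeps a hexagon only when its centre lies inside the boundary, which is a stricter rule than the intersection match. It assumes a single boundary ring without holes and treats lon/lat as planar coordinates. The helper names (point_in_ring, ring_centre, hexagons_inside) and the toy data are hypothetical.

```python
def point_in_ring(lon, lat, ring):
    """Ray casting: True if (lon, lat) lies inside the closed ring."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point matter
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def ring_centre(ring):
    """Mean of the distinct vertices (the last vertex repeats the first)."""
    pts = ring[:-1]
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))


def hexagons_inside(features, boundary_ring):
    """Keep GeoJSON features whose outer-ring centre is inside boundary_ring."""
    kept = []
    for feature in features:
        lon, lat = ring_centre(feature['geometry']['coordinates'][0])
        if point_in_ring(lon, lat, boundary_ring):
            kept.append(feature)
    return kept


# Toy data (assumption): a square boundary and two small square "hexbins"
boundary = [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]
features = [
    {'type': 'Feature', 'id': 'inside',
     'geometry': {'type': 'Polygon',
                  'coordinates': [[[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]]]}},
    {'type': 'Feature', 'id': 'outside',
     'geometry': {'type': 'Polygon',
                  'coordinates': [[[19, 19], [21, 19], [21, 21], [19, 21], [19, 19]]]}},
]
kept_ids = [f['id'] for f in hexagons_inside(features, boundary)]
```

For real country outlines with multiple parts and holes, the geopandas spatial join is the more practical choice; the sketch only illustrates what "inside the boundary" means for one hexagon.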

    Once we have rebuilt the geojson features from hexbins_in_afr and pulled out the matching customdata, we can explicitly set the following fields within fig.data[0]:

    fig.data[0]['geojson']['features'] = new_geojson
    fig.data[0]['customdata'] = hexbins_in_afr['customdata']
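
As a minimal sketch of what those two assignments amount to, the snippet below filters a trace-like dictionary by a set of hexagon ids using plain Python. The dictionary only imitates the layout shown in the dump above (features carrying an 'id', customdata rows with the id as their second entry). The helper keep_hexagons and the toy values are hypothetical, not part of plotly.

```python
def keep_hexagons(trace, keep_ids):
    """Return (features, customdata) restricted to the given hexagon ids."""
    keep = set(keep_ids)
    features = [f for f in trace['geojson']['features'] if f['id'] in keep]
    # In the dump above, the hexagon id is the second entry of each customdata row
    customdata = [row for row in trace['customdata'] if row[1] in keep]
    return features, customdata


# Toy stand-in for fig.data[0] (assumption: same layout, made-up values)
trace = {
    'geojson': {'type': 'FeatureCollection', 'features': [
        {'type': 'Feature', 'id': 'a', 'geometry': {'type': 'Polygon', 'coordinates': []}},
        {'type': 'Feature', 'id': 'b', 'geometry': {'type': 'Polygon', 'coordinates': []}},
        {'type': 'Feature', 'id': 'c', 'geometry': {'type': 'Polygon', 'coordinates': []}},
    ]},
    'customdata': [[0.0, 'a', 0], [2.0, 'b', 0], [1.0, 'c', 0]],
}

new_features, new_customdata = keep_hexagons(trace, {'b', 'c'})

# Mirrors the two assignments made on the real trace
trace['geojson']['features'] = new_features
trace['customdata'] = new_customdata
```

Filtering by id keeps the original order of the features and of the customdata rows, so the two stay paired with each other.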
    

    Here is the code with the necessary modifications:

    import numpy as np
    import pandas as pd
    import plotly.express as px
    import plotly.graph_objs as go
    import plotly.figure_factory as ff
    import geopandas as gpd
    from geopandas.tools import sjoin
    from shapely.geometry import Polygon
    import json
    
    
    data = pd.DataFrame({
        'LAT': [1,5,6,7,5,6,7,5,6,7,5,6,7,12,-40,50],
        'LON': [10,10,11,12,10,11,12,10,11,12,10,11,12,-20,40,50],
        })
    
    gdf_poly = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
    gdf_poly = gdf_poly.drop('name', axis = 1)
    
    Afr_gdf_area = gdf_poly[gdf_poly['continent'] == 'Africa'].reset_index(drop = True)
    
    fig = ff.create_hexbin_mapbox(data_frame=data,
                           lat="LAT", 
                           lon="LON",
                           nx_hexagon=25,
                           opacity=0.4,
                           labels={"color": "Point Count"},
                           mapbox_style='carto-positron',
                           zoom = 1
                           )
    
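    # Build a GeoDataFrame with one row per hexagon in the plotly trace
    gdf = gpd.GeoDataFrame({
        'customdata': fig.data[0]['customdata'].tolist(),
        'id': [item['id'] for item in fig.data[0]['geojson']['features']],
        'geometry': [Polygon(item['geometry']['coordinates'][0]) for item in fig.data[0]['geojson']['features']]
    })
    gdf.set_crs(epsg=4326, inplace=True)
    
    # Spatial inner join: keep only the hexagons that intersect an African country
    hexbins_in_afr = sjoin(gdf, Afr_gdf_area, how='inner')
    # A hexagon that overlaps several countries is matched once per country,
    # so keep a single row per hexagon id
    hexbins_in_afr = hexbins_in_afr.drop_duplicates(subset='id')
    
    def get_coordinates(polygon):
        # Convert a shapely Polygon back into GeoJSON-style nested coordinate lists
        return [[list(i) for i in polygon.exterior.coords]]
    
    hexbins_in_afr['coordinates'] = hexbins_in_afr['geometry'].apply(get_coordinates)
    
    ## create a new geojson that matches the structure of fig.data[0]['geojson']['features']
    new_geojson = [{
        'type': 'Feature', 
        'id': hex_id, 
        'geometry': {
            'type': 'Polygon', 
            'coordinates': coordinate
        }
    } for hex_id, coordinate in zip(hexbins_in_afr['id'], hexbins_in_afr['coordinates'])]
    
    # Overwrite the trace so that only the selected hexagons are drawn
    fig.data[0]['geojson']['features'] = new_geojson
    fig.data[0]['customdata'] = hexbins_in_afr['customdata']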
    
    fig.update_layout(mapbox={
            "layers": [
                {"source": json.loads(Afr_gdf_area.geometry.to_json()),
                    "below": "traces",
                    "type": "fill",
                    "color": "orange",
                    "opacity" : 0.1,
                    "line": {"width": 1}
                },
            ],
        })   
    
    fig.show()
    
