I'm working in Python with a dataset that contains a numerical variable for each Italian region, like this:
import numpy as np
import pandas as pd
regions = ['Trentino Alto Adige', "Valle d'Aosta", 'Veneto', 'Lombardia', 'Emilia-Romagna', 'Toscana', 'Friuli-Venezia Giulia', 'Liguria', 'Piemonte', 'Marche', 'Lazio', 'Umbria', 'Abruzzo', 'Sardegna', 'Puglia', 'Molise', 'Basilicata', 'Calabria', 'Sicilia', 'Campania']
df = pd.DataFrame([regions,[10+(i/2) for i in range(20)]]).transpose()
df.columns = ['region','quantity']
df.head()
I would like to generate a map of Italy in which the colour of each region depends on the numeric value of the variable quantity (df['quantity']), i.e., a choropleth map.
How can I do it?
You can use geopandas.
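The region names in your df don't exactly match those in the GeoJSON used below: two of them are spelled differently (see the comments in the code), so those two regions get no value and are drawn in grey. You can either find another GeoJSON or alter the names so they match.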
import pandas as pd
import geopandas as gpd
regions = ['Trentino Alto Adige', "Valle d'Aosta", 'Veneto', 'Lombardia', 'Emilia-Romagna', 'Toscana', 'Friuli-Venezia Giulia', 'Liguria', 'Piemonte', 'Marche', 'Lazio', 'Umbria', 'Abruzzo', 'Sardegna', 'Puglia', 'Molise', 'Basilicata', 'Calabria', 'Sicilia', 'Campania']
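# Build the frame from a dict so that 'quantity' keeps a numeric dtype
# (transposing a list of mixed rows yields object-dtype columns)
df = pd.DataFrame({'region': regions, 'quantity': [10 + (i / 2) for i in range(20)]})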
# Read a GeoJSON of Italian municipal boundaries
gdf = gpd.read_file(filename=r'https://raw.githubusercontent.com/openpolis/geojson-italy/master/geojson/limits_IT_municipalities.geojson')
# The GeoJSON is too detailed, so dissolve the municipal boundaries by the reg_name attribute
gdf = gdf.dissolve(by='reg_name')
gdf = gdf.reset_index()
# Two region names in the GeoJSON have no exact match in your df. List them with:
# gdf.reg_name[~gdf.reg_name.isin(regions)]
# 16    Trentino-Alto Adige/Südtirol
# 18    Valle d'Aosta/Vallée d'Aoste
gdf = pd.merge(left=gdf, right=df, how='left', left_on='reg_name', right_on='region')
ax = gdf.plot(
    column="quantity",
    legend=True,
    figsize=(15, 10),
    cmap='OrRd',
    missing_kwds={'color': 'lightgrey'});
ax.set_axis_off();
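If you would rather alter the names than look for another file, here is a minimal sketch of one way to do it. It is not part of the code above: the `name_fixes` dictionary and the small `geo_names` frame are hypothetical stand-ins I made up for illustration (in the real script you would apply the same `replace` to `df` before merging with `gdf`). The two GeoJSON spellings are the ones listed in the comments above.

```python
import pandas as pd

# A small subset of the question's data, for illustration only.
df = pd.DataFrame({
    "region": ["Trentino Alto Adige", "Valle d'Aosta", "Veneto"],
    "quantity": [10.0, 10.5, 11.0],
})

# Stand-in for the reg_name column of the dissolved GeoDataFrame (assumption:
# only the three names needed for this example are included).
geo_names = pd.DataFrame({
    "reg_name": [
        "Trentino-Alto Adige/Südtirol",
        "Valle d'Aosta/Vallée d'Aoste",
        "Veneto",
    ],
})

# Hypothetical mapping from the spelling in df to the spelling in the GeoJSON.
name_fixes = {
    "Trentino Alto Adige": "Trentino-Alto Adige/Südtirol",
    "Valle d'Aosta": "Valle d'Aosta/Vallée d'Aoste",
}
df["region"] = df["region"].replace(name_fixes)

# indicator=True adds a _merge column, which makes leftover mismatches easy to spot.
merged = geo_names.merge(
    df, how="left", left_on="reg_name", right_on="region", indicator=True
)
unmatched = merged.loc[merged["_merge"] == "left_only", "reg_name"].tolist()
print(unmatched)
```

An empty list means every geometry found a value, so no region should fall back to the `missing_kwds` colour when plotted.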