Tags: python-3.x, pandas, pyviz, mosaic-plot, panel-pyviz

How to Successfully Produce Mosaic Plots in Pyviz Panel Apps?


I have created the following dataframe df:

Setup:

import pandas as pd
import numpy as np
import random
import copy
import feather
import matplotlib.pyplot as plt
from statsmodels.graphics.mosaicplot import mosaic
import plotly.graph_objects as go
import plotly.express as px
import panel as pn
import holoviews as hv
import geoviews as gv
import geoviews.feature as gf
import cartopy
import cartopy.feature as cf
from geoviews import opts
from cartopy import crs as ccrs
import hvplot.pandas
import colorcet as cc
from colorcet.plotting import swatch
#pn.extension() # commented out as this causes an intermittent javascript error
gv.extension("bokeh")
cols = {"name":["Jim","Alice","Bob","Julia","Fern","Bill","Jordan","Pip","Shelly","Mimi"], 
         "age":[19,26,37,45,56,71,20,36,37,55], 
         "sex":["Male","Female","Male","Female","Female","Male","Male","Male","Female","Female"],
         "age_band":["18-24","25-34","35-44","45-54","55-64","65-74","18-24","35-44","35-44","55-64"],
         "insurance_renew_month":[1,2,3,3,3,4,5,5,6,7],
         "postcode_prefix":["EH","M","G","EH","EH","M","G","EH","M","EH"],
         "postcode_order":[3,2,1,3,3,2,1,3,2,3],
         "local_authority_district":["S12000036","E08000003","S12000049","S12000036","S12000036","E08000003","S12000036","E08000003","S12000049","S12000036"],
         "blah1":[3,None,None,8,8,None,1,None,None,None],
         "blah2":[None,None,None,33,5,None,66,3,22,3],
         "blah3":["A",None,"A",None,"C",None,None,None,None,None],
         "blah4":[None,None,None,None,None,None,None,None,None,1]}
df = pd.DataFrame.from_dict(cols)
df
Out[2]: 
     name  age     sex age_band  ...  blah1 blah2  blah3 blah4
0     Jim   19    Male    18-24  ...    3.0   NaN      A   NaN
1   Alice   26  Female    25-34  ...    NaN   NaN   None   NaN
2     Bob   37    Male    35-44  ...    NaN   NaN      A   NaN
3   Julia   45  Female    45-54  ...    8.0  33.0   None   NaN
4    Fern   56  Female    55-64  ...    8.0   5.0      C   NaN
5    Bill   71    Male    65-74  ...    NaN   NaN   None   NaN
6  Jordan   20    Male    18-24  ...    1.0  66.0   None   NaN
7     Pip   36    Male    35-44  ...    NaN   3.0   None   NaN
8  Shelly   37  Female    35-44  ...    NaN  22.0   None   NaN
9    Mimi   55  Female    55-64  ...    NaN   3.0   None   1.0

[10 rows x 12 columns]
df[["sex","age_band","postcode_prefix"]] = df[["sex","age_band","postcode_prefix"]].astype("category")
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 12 columns):
name                        10 non-null object
age                         10 non-null int64
sex                         10 non-null category
age_band                    10 non-null category
insurance_renew_month       10 non-null int64
postcode_prefix             10 non-null category
postcode_order              10 non-null int64
local_authority_district    10 non-null object
blah1                       4 non-null float64
blah2                       6 non-null float64
blah3                       3 non-null object
blah4                       1 non-null float64
dtypes: category(3), float64(3), int64(3), object(3)
memory usage: 1.3+ KB

The Problem:

I can successfully create a mosaic plot with the following code:

fig,ax = plt.subplots(figsize=(15,10))
mosaic(df,["sex", "age_band"],ax=ax);

[Image: mosaic plot of sex by age_band]

However, I am having issues when I try to create a corresponding app using pn.interact:

categoric_cols = df.select_dtypes(include="category")
cat_atts = categoric_cols.columns.tolist()
cat_atts
Out[4]: ['sex', 'age_band', 'postcode_prefix']
def bivar_cat(x="sex",y="age_band"):
    if x in cat_atts and y in cat_atts:
        fig,ax = plt.subplots(figsize=(15,10))
        return mosaic(df,[x,y],ax=ax);

app_df_cat = pn.interact(bivar_cat,x=cat_atts,y=cat_atts)
app_df_cat

Which results in the following:

[Image: the rendered app, showing the x and y dropdowns, a line of text, and a mosaic plot below it]

The rendered mosaic plot appears to correspond to the default values of x and y (i.e. sex and age_band). When I select a new attribute for x or y from the dropdowns, the text above the mosaic plot changes (this text seems to be a string representation of the plot), but the mosaic plot itself does not.

Is my issue possibly related to having to comment out pn.extension()? When pn.extension() is not commented out, I get an intermittent JavaScript error: sometimes no error is raised, sometimes there is an error but my Panel app still loads, and sometimes there is an error and it crashes my browser. (I have omitted the JavaScript error here as it can be very large; I can add it to my post if that would help.) The error is raised significantly more often than not.

Strangely enough, I haven't observed any difference in the other apps I have created between omitting pn.extension() and including it. However, as the documentation always says to include it, I would have expected to need the appropriate extensions set for all my plots to work correctly. (In these other apps, plotly, hvplot, holoviews and geoviews plots render successfully both with and without pn.extension() and pn.extension("plotly").)

Is it possible to produce panel apps based on mosaic plots?

Thanks


Software Info:

os x                      Catalina 
browser                   Firefox
python                    3.7.5
notebook                  6.0.2 
pandas                    0.25.3
panel                     0.7.0
plotly                    4.3.0 
plotly_express            0.4.1 
holoviews                 1.12.6
geoviews                  1.6.5 
hvplot                    0.5.2 

Solution

  • The statsmodels function mosaic() returns a tuple containing a figure and rects (a dictionary describing the tiles).

    What pn.interact is currently displaying is that tuple, which is why you see text above the plot. The tuple does get updated when you use the dropdowns.

    The figure you see below the text is the one that Jupyter automatically plots once, when the cell first runs. That figure is not connected to the dropdowns, so it never gets updated.
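A quick way to confirm the tuple is to inspect what mosaic() returns outside of Panel. The sketch below is illustrative only: it uses a small stand-in dataframe (`demo_df`, a hypothetical name) rather than the df from the question, and it selects the non-interactive Agg backend as an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.figure import Figure
from statsmodels.graphics.mosaicplot import mosaic

# Hypothetical stand-in data; any dataframe with two categorical columns would do.
demo_df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female", "Female", "Male"],
    "age_band": ["18-24", "25-34", "35-44", "18-24", "35-44", "25-34"],
})

fig, ax = plt.subplots()
result = mosaic(demo_df, ["sex", "age_band"], ax=ax)
plt.close(fig)  # keep the figure object, but stop pyplot from tracking it

# mosaic() hands back two things: the matplotlib Figure and a dict of tile geometry.
returned_fig, rects = result
print(type(result).__name__, len(result))
print(isinstance(returned_fig, Figure), isinstance(rects, dict))
print(list(rects)[:3])  # the keys are tuples of category values
```

Returning `result` from the interact callback would display the tuple as text; returning `returned_fig` gives Panel a Matplotlib figure to render.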

    The solution is two-fold:
    1) only return the figure, not the tuple
    2) prevent Jupyter from automatically plotting the figure once, by calling plt.close() before returning it

    In code:

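    def bivar_cat(x='sex', y='age_band'):
        fig, ax = plt.subplots(figsize=(15,10))
        # draw onto ax; the (figure, rects) tuple that mosaic() returns is ignored
        mosaic(df, [x,y], ax=ax)
        # close this figure in pyplot so Jupyter does not also display it once
        plt.close(fig)
        return fig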
    
    app_df_cat = pn.interact(
        bivar_cat, 
        x=cat_atts, 
        y=cat_atts,
    )
    
    app_df_cat
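
As a variation on the code above, the figure can be built with matplotlib's object-oriented Figure class instead of plt.subplots(). A figure created this way is never registered with pyplot, so there is nothing for Jupyter to auto-display and no plt.close() call is needed. This is a sketch under assumptions: `make_mosaic_figure`, `demo_df` and `cat_cols` are hypothetical names, the data is a small stand-in for df, and the Panel wiring appears only as a comment because it mirrors the pn.interact call above.

```python
import pandas as pd
from matplotlib.figure import Figure
from statsmodels.graphics.mosaicplot import mosaic

# Hypothetical stand-in data with three categorical columns.
demo_df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female", "Female", "Male"],
    "age_band": ["18-24", "25-34", "35-44", "18-24", "35-44", "25-34"],
    "postcode_prefix": ["EH", "M", "G", "EH", "M", "G"],
})
cat_cols = ["sex", "age_band", "postcode_prefix"]


def make_mosaic_figure(x="sex", y="age_band"):
    """Return only the Figure for a mosaic plot of columns x and y."""
    fig = Figure(figsize=(15, 10))  # created directly, so pyplot never tracks it
    ax = fig.subplots()
    mosaic(demo_df, [x, y], ax=ax)  # the returned (figure, rects) tuple is discarded
    return fig


example_fig = make_mosaic_figure("sex", "postcode_prefix")

# With Panel available, the function would be wired up as in the answer:
# app = pn.interact(make_mosaic_figure, x=cat_cols, y=cat_cols)
```

Either approach gives the callback a single Figure to return; the difference is only in whether pyplot's global state is involved.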