python plotly sankey-diagram holoviews

Sankey Plot not Showing in Jupyter Notebook


I'm pretty sure my code is fine, but I can't generate a plot of a simple Sankey chart. Maybe something is off with the code; I'm not sure. Here's what I have now. Can anyone see a problem with this?

import pandas as pd
import holoviews as hv
import plotly.graph_objects as go
import plotly.express as pex
hv.extension('bokeh')

data = [['TMD','TMD Create','Sub-Section 1',17],['TMD','TMD Create','Sub-Section 1',17],['C4C','Customer Tab','Sub-Section 1',10],['C4C','Customer Tab','Sub-Section 1',10],['C4C','Customer Tab','Sub-Section 1',17]]
df = pd.DataFrame(data, columns=['Source','Target','Attribute','Value'])
df

source = df["Source"].values.tolist()
target = df["Target"].values.tolist()
value = df["Value"].values.tolist()
labels = df["Attribute"].values.tolist()

import plotly.graph_objs as go

#create links
link = dict(source=source, target=target, value=value, 
color=["turquoise","tomato"] * len(source))

#create nodes
node = dict(label=labels, pad=15, thickness=5)

#create a sankey object
chart = go.Sankey(link=link, node=node, arrangement="snap")

#build a figure
fig = go.Figure(chart)
fig.show()

I am trying to follow the basic example shown in the link below.

https://python.plainenglish.io/create-a-sankey-diagram-in-python-e09e23cb1a75


Solution

  • You mention two different packages, and each needs a different solution. I don't know which you prefer, so I explain both.

    Data

    import pandas as pd
    df = pd.DataFrame({
        'Source':['a','a','b','b'],
        'Target':['c','d','c','d'],
        'Value': [1,2,3,4]
    })
    >>> df
      Source Target  Value
    0      a      c      1
    1      a      d      2
    2      b      c      3
    3      b      d      4
    

    This is a very basic DataFrame with only 4 transitions.

    Holoviews/Bokeh

    With holoviews it is very easy to plot a Sankey diagram, because it takes the DataFrame as it is and derives the node labels from the values in the Source and Target columns.

    import holoviews as hv
    hv.extension('bokeh')
    
    sankey = hv.Sankey(df)
    sankey.opts(width=600, height=400)
    

    This is created with holoviews 1.15.4 and bokeh 2.4.3.

    [Figure: Sankey diagram rendered with holoviews]

    Plotly

    For plotly it is not so easy, because plotly wants numbers (integer indices into the list of node labels) instead of letters in the Source and Target columns. Therefore we have to manipulate the DataFrame first before we can create the figure.

    Here I collect all distinct labels and replace each one with a unique number.

    # sort the labels so the label -> number mapping is deterministic
    unique_labels = sorted(set(df['Source']) | set(df['Target']))
    mapper = {v: i for i, v in enumerate(unique_labels)}
    df['Source'] = df['Source'].map(mapper)
    df['Target'] = df['Target'].map(mapper)
    >>> df
       Source  Target  Value
    0       0       2      1
    1       0       3      2
    2       1       2      3
    3       1       3      4
    

    Afterwards I can create the dicts that plotly takes. I have to set the labels by hand, in the order of the numbers assigned above, and the lengths of the arrays have to match.

    import plotly.graph_objects as go
    
    source = df["Source"].values.tolist()
    target = df["Target"].values.tolist()
    value = df["Value"].values.tolist()
    
    #create links
    link = dict(source=source, target=target, value=value, color=["turquoise","tomato"] * 2)
    
    #create nodes
    node = dict(label=['a', 'b', 'c', 'd'], pad=15, thickness=5)
    
    #create a sankey object
    chart = go.Sankey(link=link, node=node, arrangement="snap")
    
    #build a figure
    fig = go.Figure(chart)
    fig.show()
    

    I used plotly 5.13.0.
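If you would rather not type the node labels by hand, the label list and the integer indices can be derived from the same mapping, which keeps them in sync. The following is only a minimal sketch in plain Python; the helper name `encode_sankey` and the `palette` variable are my own and not part of plotly. It also shows one way to produce exactly one color per link.

```python
from itertools import cycle, islice


def encode_sankey(sources, targets):
    """Return (labels, source_idx, target_idx) for lists of node names."""
    # dict.fromkeys drops duplicates and keeps first-seen order
    labels = list(dict.fromkeys(list(sources) + list(targets)))
    index = {name: i for i, name in enumerate(labels)}
    return labels, [index[s] for s in sources], [index[t] for t in targets]


# the same four transitions as in the DataFrame above
labels, source, target = encode_sankey(['a', 'a', 'b', 'b'], ['c', 'd', 'c', 'd'])
value = [1, 2, 3, 4]

# repeat the two colors until there is one entry per link
palette = ["turquoise", "tomato"]
colors = list(islice(cycle(palette), len(value)))

print(labels)  # ['a', 'b', 'c', 'd']
print(source)  # [0, 0, 1, 1]
print(target)  # [2, 3, 2, 3]
print(colors)  # ['turquoise', 'tomato', 'turquoise', 'tomato']
```

These lists can then go into `dict(source=source, target=target, value=value, color=colors)` for the links and `dict(label=labels, pad=15, thickness=5)` for the nodes.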

    [Figure: Sankey diagram rendered with plotly]
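The same idea applied to the data from the question might look like the sketch below. It uses plain Python so the steps are visible. Summing the repeated Source/Target rows into one link is an assumption on my side about what the chart should show; skip that step if every row should stay a separate link. The Attribute column is left out, because the node labels have to be the Source and Target names.

```python
from collections import defaultdict

# (Source, Target, Value) rows from the question
rows = [
    ('TMD', 'TMD Create', 17),
    ('TMD', 'TMD Create', 17),
    ('C4C', 'Customer Tab', 10),
    ('C4C', 'Customer Tab', 10),
    ('C4C', 'Customer Tab', 17),
]

# assumption: repeated (source, target) pairs are summed into one link
totals = defaultdict(int)
for src, tgt, val in rows:
    totals[(src, tgt)] += val

# node labels in first-seen order: all sources first, then all targets
labels = list(dict.fromkeys([s for s, _ in totals] + [t for _, t in totals]))
index = {name: i for i, name in enumerate(labels)}

# integer indices into `labels`, one entry per link
source = [index[s] for s, _ in totals]
target = [index[t] for _, t in totals]
value = list(totals.values())

print(labels)  # ['TMD', 'C4C', 'TMD Create', 'Customer Tab']
print(source)  # [0, 1]
print(target)  # [2, 3]
print(value)   # [34, 37]
```

With these lists, `node = dict(label=labels, pad=15, thickness=5)` and `link = dict(source=source, target=target, value=value)` have matching lengths and can be passed to `go.Sankey` as in the code above.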