I'm pretty sure my code is fine, but I can't generate a plot of a simple Sankey chart. Maybe something is off with the code; I'm not sure. Here's what I have now. Can anyone see a problem with this?
import pandas as pd
import holoviews as hv
import plotly.graph_objects as go
import plotly.express as pex
hv.extension('bokeh')
data = [['TMD','TMD Create','Sub-Section 1',17],['TMD','TMD Create','Sub-Section 1',17],['C4C','Customer Tab','Sub-Section 1',10],['C4C','Customer Tab','Sub-Section 1',10],['C4C','Customer Tab','Sub-Section 1',17]]
df = pd.DataFrame(data, columns=['Source','Target','Attribute','Value'])
df
source = df["Source"].values.tolist()
target = df["Target"].values.tolist()
value = df["Value"].values.tolist()
labels = df["Attribute"].values.tolist()
import plotly.graph_objs as go
#create links
link = dict(source=source, target=target, value=value,
color=["turquoise","tomato"] * len(source))
#create nodes
node = dict(label=labels, pad=15, thickness=5)
#create a sankey object
chart = go.Sankey(link=link, node=node, arrangement="snap")
#build a figure
fig = go.Figure(chart)
fig.show()
I am trying to follow the basic example shown in the link below.
https://python.plainenglish.io/create-a-sankey-diagram-in-python-e09e23cb1a75
You are mentioning two different packages, and each needs a different solution. I don't know which you prefer, so I will explain both.
import pandas as pd
df = pd.DataFrame({
'Source':['a','a','b','b'],
'Target':['c','d','c','d'],
'Value': [1,2,3,4]
})
>>> df
  Source Target  Value
0      a      c      1
1      a      d      2
2      b      c      3
3      b      d      4
This is a very basic DataFrame with only 4 transitions.
With holoviews it is very easy to plot a Sankey diagram, because it takes the DataFrame as it is and derives the labels from the letters in the Source and Target columns.
import holoviews as hv
hv.extension('bokeh')
sankey = hv.Sankey(df)
sankey.opts(width=600, height=400)
This is created with holoviews 1.15.4 and bokeh 2.4.3.
For plotly it is not so easy, because plotly wants numbers instead of letters in the Source and Target columns. Therefore we have to manipulate the DataFrame first before we can create the figure.
Here I collect all different labels and replace them by a unique number.
# sorted() makes the label -> number mapping deterministic (a plain set has no guaranteed order)
unique_labels = sorted(set(df['Source']) | set(df['Target']))
mapper = {v: i for i, v in enumerate(unique_labels)}
df['Source'] = df['Source'].map(mapper)
df['Target'] = df['Target'].map(mapper)
>>> df
   Source  Target  Value
0       0       2      1
1       0       3      2
2       1       2      3
3       1       3      4
Afterwards I can create the dicts which plotly takes. I have to set the labels by hand, in the order of the numbers assigned above, and the lengths of the arrays have to match.
import plotly.graph_objects as go
source = df["Source"].values.tolist()
target = df["Target"].values.tolist()
value = df["Value"].values.tolist()
#create links
link = dict(source=source, target=target, value=value, color=["turquoise","tomato"] * 2)
#create nodes
node = dict(label=['a', 'b', 'c', 'd'], pad=15, thickness=5)
#create a sankey object
chart = go.Sankey(link=link, node=node, arrangement="snap")
#build a figure
fig = go.Figure(chart)
fig.show()
I used plotly 5.13.0.
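To tie this back to the data in the question, here is a minimal sketch of the same label-to-number step applied to those rows. The helper name `build_sankey_inputs` and its `palette` parameter are made up for illustration and are not part of plotly. It works on plain lists, so it runs without any third-party package. I am also assuming the nodes should be named after the Source and Target values rather than the Attribute column.

```python
from itertools import cycle, islice


def build_sankey_inputs(sources, targets, values, palette=("turquoise", "tomato")):
    """Turn label-based links into index-based link/node dicts (hypothetical helper)."""
    if not (len(sources) == len(targets) == len(values)):
        raise ValueError("sources, targets and values must have the same length")

    # dict.fromkeys keeps first-seen order, so the label -> index mapping is deterministic.
    labels = list(dict.fromkeys(list(sources) + list(targets)))
    index = {label: i for i, label in enumerate(labels)}

    link = dict(
        source=[index[s] for s in sources],
        target=[index[t] for t in targets],
        value=list(values),
        # Repeat the palette until there is exactly one colour per link.
        color=list(islice(cycle(palette), len(values))),
    )
    node = dict(label=labels, pad=15, thickness=5)
    return link, node


# The rows from the question as plain lists, e.g. from df["Source"].tolist().
sources = ["TMD", "TMD", "C4C", "C4C", "C4C"]
targets = ["TMD Create", "TMD Create", "Customer Tab", "Customer Tab", "Customer Tab"]
values = [17, 17, 10, 10, 17]

link, node = build_sankey_inputs(sources, targets, values)
print(node["label"])
print(link["source"], link["target"])
```

The resulting `link` and `node` can be passed to `go.Sankey(link=link, node=node, arrangement="snap")` exactly as in the plotly code above. Compared with the code in the question, the two differences are that source and target are now integer positions in the label list, and the colour list has the same length as the link lists.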