[SOLVED] Repetition of key data in plotly

I have the following input data format:

CustomerRef	London	Edinburgh	Cardiff
0001	A	A	A
0002	D	D	B
0003	B	A	A
0004	D	D	B
0005	A	C	A
0006	D	D	B
0007	A	D	A
0008	D	C	B

I want to generate a stacked bar chart with one bar per city column in my data, so that I can see the totals of A, B, C and D within a single bar.

I've only been able to find examples that merge multiple columns into a single chart, that take an input which would require me to transform the data significantly (e.g. https://plotly.com/python/bar-charts/#stacked-bar-chart, where I would need to provide a series of values for 'A', then 'B', etc.), or that do some kind of counting beforehand (e.g. https://plotly.com/python/histograms/#specify-aggregation-function or https://plotly.com/python/bar-charts/#bar-charts-with-long-format-data).

Here's my code:

import pandas as pd
import plotly.graph_objects as go
from dash import Dash, html, dash_table, dcc, callback, Output, Input


df = pd.read_csv("sentiment.csv")

#generate unique list of values for London
london_values = df['london'].unique()

#name the London column
london_col_name = ['London']
edi_col_name = ['Edinburgh']
cardiff_col_name = ['Cardiff']

bars = []
for res in df['london'].unique():
    bar = go.Bar(name=res, x=london_col_name, y=df[df['london'] == res])
    bars.append(bar)

for res in df['edinburgh'].unique():
    bar = go.Bar(name=res, x=edi_col_name, y=df[df['edinburgh'] == res])
    bars.append(bar)
    
for res in df['cardiff'].unique():
    bar = go.Bar(name=res, x=cardiff_col_name, y=df[df['cardiff'] == res])
    bars.append(bar)

fig = go.Figure(data=bars)
fig.update_layout(barmode='stack', title='Results By Location')

app = Dash()
app.layout = html.Div([
    dcc.Graph(figure=fig)
])

if __name__ == '__main__': 
    app.run(debug=True)

I'm seeing repeated entries in the key (legend), e.g. "ABDCADCADCA". How can I get a single key that applies across all the city columns?

Thanks!

Solution

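Every go.Bar trace gets its own legend entry, so building one trace per category per city repeats each category in the key. Instead, create one Bar trace per category that spans all the cities. A simple approach is to count the occurrences of each category for each city beforehand: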

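import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv("sentiment.csv")  # same file as in the question

categories = ['A', 'B', 'C', 'D']
cities = ['London', 'Edinburgh', 'Cardiff']  # must match the CSV column headers exactly

# Count how often each category occurs in each city column
occurrences = {city: df[city].value_counts() for city in cities}

# One trace per category, so each category appears only once in the legend
bars = []
for cat in categories:
    # .get() falls back to 0 when a category never occurs in a city
    count = [occurrences[city].get(cat, default=0) for city in cities]
    bar = go.Bar(name=cat, x=cities, y=count)
    bars.append(bar)

fig = go.Figure(data=bars)
fig.update_layout(barmode='stack', title='Results By Location')

fig.show()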
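
To check the counting step without a CSV file, here is a minimal sketch that reproduces it with the standard library only. The sample rows are copied from the question. The helper names count_by_city and build_figure are hypothetical, not part of plotly or pandas, and plotly is imported only inside build_figure so the counts can be verified on their own.

```python
from collections import Counter

# Sample rows copied from the question: (CustomerRef, London, Edinburgh, Cardiff)
ROWS = [
    ("0001", "A", "A", "A"),
    ("0002", "D", "D", "B"),
    ("0003", "B", "A", "A"),
    ("0004", "D", "D", "B"),
    ("0005", "A", "C", "A"),
    ("0006", "D", "D", "B"),
    ("0007", "A", "D", "A"),
    ("0008", "D", "C", "B"),
]
CITIES = ["London", "Edinburgh", "Cardiff"]
CATEGORIES = ["A", "B", "C", "D"]


def count_by_city(rows, cities, categories):
    """Return {category: [count in each city, in the order of `cities`]}."""
    # Column i + 1 of each row holds the value for cities[i]
    # (column 0 is the customer reference).
    per_city = [Counter(row[i + 1] for row in rows) for i in range(len(cities))]
    # A Counter returns 0 for missing keys, so absent categories need no special case.
    return {cat: [counter[cat] for counter in per_city] for cat in categories}


def build_figure(counts, cities):
    """Build a stacked bar chart with one trace (one legend entry) per category."""
    # Imported here so the counting code above runs without plotly installed.
    import plotly.graph_objects as go

    traces = [go.Bar(name=cat, x=cities, y=ys) for cat, ys in counts.items()]
    fig = go.Figure(data=traces)
    fig.update_layout(barmode="stack", title="Results By Location")
    return fig


counts = count_by_city(ROWS, CITIES, CATEGORIES)
```

Calling build_figure(counts, CITIES).show() should then display three stacked bars, one per city, with a key of four entries.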