python plotly

Repetition of key data in plotly


I have the following input data format:

CustomerRef London Edinburgh Cardiff
0001 A A A
0002 D D B
0003 B A A
0004 D D B
0005 A C A
0006 D D B
0007 A D A
0008 D C B

I want to generate a chart like this, with one bar per city column in my data, stacked so that I can see the totals of A, B, C and D in a single bar: (image: stacked bar chart)

I've only been able to find examples that merge multiple columns into a single chart, that take an input which would require me to transform the data significantly (e.g. https://plotly.com/python/bar-charts/#stacked-bar-chart, where I would need to provide a series of values for 'A', then 'B', etc.), or that need some kind of counting beforehand (e.g. https://plotly.com/python/histograms/#specify-aggregation-function or https://plotly.com/python/bar-charts/#bar-charts-with-long-format-data).

Here's my code:

import pandas as pd
import plotly.graph_objects as go
from dash import Dash, html, dash_table, dcc, callback, Output, Input


df = pd.read_csv("sentiment.csv")

#generate unique list of values for London
london_values = df['london'].unique()

#name the London column
london_col_name = ['London']
edi_col_name = ['Edinburgh']
cardiff_col_name = ['Cardiff']

bars = []
for res in df['london'].unique():
    bar = go.Bar(name=res, x=london_col_name, y=df[df['london'] == res])
    bars.append(bar)

for res in df['edinburgh'].unique():
    bar = go.Bar(name=res, x=edi_col_name, y=df[df['edinburgh'] == res])
    bars.append(bar)
    
for res in df['cardiff'].unique():
    bar = go.Bar(name=res, x=cardiff_col_name, y=df[df['cardiff'] == res])
    bars.append(bar)

fig = go.Figure(data=bars)
fig.update_layout(barmode='stack', title='Results By Location')

app = Dash()
app.layout = html.Div([
    dcc.Graph(figure=fig)
])

if __name__ == '__main__': 
    app.run(debug=True)

I'm seeing repeated entries in the key (the legend), e.g. "ABDCADCADCA". How can I get a single key that applies across all the city columns?

Thanks!


Solution

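  • You need to create one Bar trace per category, not one per category and city: every trace gets its own legend entry, which is why the key repeats. A simple approach is to count the occurrences of each category for each city beforehand: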

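    # Assumes `df` is the DataFrame from the question and that the
    # column names match the CSV header exactly (they are case-sensitive).
    categories = ['A', 'B', 'C', 'D']
    cities = ['London', 'Edinburgh', 'Cardiff']

    # One value_counts() Series per city: category -> number of rows
    occurrences = {city: df[city].value_counts() for city in cities}

    bars = []
    for cat in categories:
        # Count of this category in each city, 0 if it never occurs there
        count = [occurrences[city].get(cat, default=0) for city in cities]
        # A single trace per category spans all cities, so it appears once in the legend
        bar = go.Bar(name=cat, x=cities, y=count)
        bars.append(bar)

    fig = go.Figure(data=bars)
    fig.update_layout(barmode='stack', title='Results By Location')

    fig.show()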
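
To check what each trace ends up receiving, the counting step can be reproduced without pandas or plotly. The following is only a sketch: it hard-codes the eight sample rows from the question, and the names `rows` and `counts_by_category` are made up for illustration. Each list it prints is what would be passed as `y` for that category's `go.Bar`, in the order London, Edinburgh, Cardiff.

```python
from collections import Counter

# Sample rows copied from the question:
# (CustomerRef, London, Edinburgh, Cardiff)
rows = [
    ("0001", "A", "A", "A"),
    ("0002", "D", "D", "B"),
    ("0003", "B", "A", "A"),
    ("0004", "D", "D", "B"),
    ("0005", "A", "C", "A"),
    ("0006", "D", "D", "B"),
    ("0007", "A", "D", "A"),
    ("0008", "D", "C", "B"),
]

cities = ["London", "Edinburgh", "Cardiff"]
categories = ["A", "B", "C", "D"]

# One Counter per city, built from that city's column
# (column 0 is CustomerRef, so city i lives in column i + 1).
occurrences = {
    city: Counter(row[i + 1] for row in rows)
    for i, city in enumerate(cities)
}

# One list per category, ordered like `cities`.
# A Counter returns 0 for a key it has never seen.
counts_by_category = {
    cat: [occurrences[city][cat] for city in cities]
    for cat in categories
}

for cat, counts in counts_by_category.items():
    print(cat, counts)
# A [3, 2, 4]
# B [1, 0, 4]
# C [0, 2, 0]
# D [4, 4, 0]
```

Each stacked bar sums to 8, the number of customers, which is a quick sanity check that no row was dropped or counted twice.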