I have the following input data format:
CustomerRef | London | Edinburgh | Cardiff |
---|---|---|---|
0001 | A | A | A |
0002 | D | D | B |
0003 | B | A | A |
0004 | D | D | B |
0005 | A | C | A |
0006 | D | D | B |
0007 | A | D | A |
0008 | D | C | B |
I want to generate a chart like this, with one bar per city column in my data, stacked so that I can see the totals of A, B, C and D in a single column:
I've only been able to find examples that merge multiple columns into a single chart, or that take an input which would require me to transform the data significantly (e.g. https://plotly.com/python/bar-charts/#stacked-bar-chart, where I would need to provide a series of values for 'A', then 'B', etc.), or that need some kind of counting beforehand (e.g. https://plotly.com/python/histograms/#specify-aggregation-function or https://plotly.com/python/bar-charts/#bar-charts-with-long-format-data).
Here's my code:
import pandas as pd
import plotly.graph_objects as go
from dash import Dash, html, dash_table, dcc, callback, Output, Input
df = pd.read_csv("sentiment.csv")
#generate unique list of values for London
london_values = df['london'].unique()
#name the London column
london_col_name = ['London']
edi_col_name = ['Edinburgh']
cardiff_col_name = ['Cardiff']
bars = []
for res in df['london'].unique():
    bar = go.Bar(name=res, x=london_col_name, y=df[df['london'] == res])
    bars.append(bar)
for res in df['edinburgh'].unique():
    bar = go.Bar(name=res, x=edi_col_name, y=df[df['edinburgh'] == res])
    bars.append(bar)
for res in df['cardiff'].unique():
    bar = go.Bar(name=res, x=cardiff_col_name, y=df[df['cardiff'] == res])
    bars.append(bar)
fig = go.Figure(data=bars)
fig.update_layout(barmode='stack', title='Results By Location')
app = Dash()
app.layout = html.Div([
    dcc.Graph(figure=fig)
])
if __name__ == '__main__':
    app.run(debug=True)
I'm seeing repeated entries in the legend (the key shows "ABDCADCADCA" etc.). I would like to know how to get a single legend that applies across all the city columns, please.
Thanks!
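Every Bar trace gets its own legend entry, and your loops create one trace per category per city, which is why the legend repeats. You need to create one Bar trace per category, covering all cities. A simple approach is to count the occurrences of each category for each city beforehand: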
categories = ['A', 'B', 'C', 'D']
cities = ['London', 'Edinburgh', 'Cardiff']
occurrences = {city: df[city].value_counts() for city in cities}
bars = []
for cat in categories:
    count = [occurrences[city].get(cat, default=0) for city in cities]
    bar = go.Bar(name=cat, x=cities, y=count)
    bars.append(bar)
fig = go.Figure(data=bars)
fig.update_layout(barmode='stack', title='Results By Location')
fig.show()
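To check the counting step on its own, here is a minimal sketch that rebuilds the sample table from the question in memory and computes the per-category counts with pandas only. It assumes the column names match the headers shown in the table (London, Edinburgh, Cardiff); the names counts and y_by_category are made up for this illustration. Each list in y_by_category is what would be passed as y to the single Bar trace for that category.

```python
import pandas as pd

# Sample data copied from the table in the question (assumed column names).
df = pd.DataFrame({
    "CustomerRef": ["0001", "0002", "0003", "0004", "0005", "0006", "0007", "0008"],
    "London":      ["A", "D", "B", "D", "A", "D", "A", "D"],
    "Edinburgh":   ["A", "D", "A", "D", "C", "D", "D", "C"],
    "Cardiff":     ["A", "B", "A", "B", "A", "B", "A", "B"],
})

categories = ["A", "B", "C", "D"]
cities = ["London", "Edinburgh", "Cardiff"]

# One row per category, one column per city; categories that never
# occur in a city come out as NaN and are filled with 0.
counts = (
    pd.DataFrame({city: df[city].value_counts() for city in cities})
    .reindex(categories)
    .fillna(0)
    .astype(int)
)

# y values for each category's trace, in the same order as `cities`.
y_by_category = {cat: counts.loc[cat].tolist() for cat in categories}
print(counts)
```

Because there is exactly one list per category, building one Bar trace from each list gives four legend entries in total, regardless of how many city columns there are.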