I have a table with names, year and a score/count:
("Andi", 2020, 40),
("Andi", 2021, 60),
("Andi", 2022, 55),
("Andi", 2023, 55),
("Alex", 2020, 14),
("Alex", 2021, 1),
("Alex", 2022, 13),
("Alex", 2023, 13),
...
I try to zip that into an Excel file with writefile() etc. And it should look like this
NAME | 2020 | 2021 | 2022 | 2023 | TOTAL |
---|---|---|---|---|---|
Andi | 40 | 60 | 55 | 55 | 210 |
Alex | 14 | 1 | 13 | 13 | 41 |
And ofc next year they will be new records added: ("Andi",2024,49), ("Alex",2024,10),... So the sqlstament should be also add that when I recreate the Excel file again next year. I have found this: How to select Column value as Column name with conditions in SQL table which is hardcoded and works how I would want it but I dont want to change it every year. My Database also starts at 2016, so that would be a very long statment. I tried to make for loops for select, from, on,... but I think there could be an easier way?
Any ideas how I can achieve that? Thank you
You can use a SQL query like below to pivot the columns names as year.
SELECT name,
MAX(CASE WHEN year = 2020 THEN score END) AS '2020',
MAX(CASE WHEN year = 2021 THEN score END) AS '2021',
MAX(CASE WHEN year = 2022 THEN score END) AS '2022',
MAX(CASE WHEN year = 2023 THEN score END) AS '2023',
MAX(CASE WHEN year = 2024 THEN score END) AS '2024',
COALESCE(MAX(CASE WHEN year = 2020 THEN score END), 0)+COALESCE(MAX(CASE WHEN year = 2021 THEN score END), 0)+COALESCE(MAX(CASE WHEN year = 2022 THEN score END), 0)+COALESCE(MAX(CASE WHEN year = 2023 THEN score END), 0)+COALESCE(MAX(CASE WHEN year = 2024 THEN score END), 0) AS total
FROM scores
GROUP BY name
ORDER BY name;
This can be dynamically generated as when new years get added in.
# Get all the years
cursor.execute("SELECT DISTINCT year FROM scores ORDER BY year")
years = cursor.fetchall()
# Generate the pivot columns as years
columns = []
for year in years:
columns.append(f"MAX(CASE WHEN year = {year[0]} THEN score END) AS '{year[0]}'")
columns_str = ", ".join(columns)
sql = f"""
SELECT name, {columns_str},
{"+".join([f"COALESCE(MAX(CASE WHEN year = {year[0]} THEN score END), 0)" for year in years])} AS total
FROM scores
GROUP BY name
ORDER BY name;
"""
cursor.execute(sql)
result = cursor.fetchall()
df = pd.DataFrame.from_records(result, columns=['name'] + [str(year[0]) for year in years] + ['total'])
print(df)
Output
name 2020 2021 2022 2023 2024 total
0 Alex 14 1 13 13 10 51
1 Andi 40 60 55 55 49 259