Given the following dataframe:
Year 2024 2023 2022
Header N Result SD N Result SD N Result SD
Vendor
A 5 20 3 5 22 4 1 21 3
B 4 25 2 4 25 3 4 26 5
C 9 22 3 9 27 1 3 23 3
D 3 23 5 3 16 2 5 13 4
E 5 27 2 5 21 3 3 19 5
I would like to calculate for each year the mean value of the results column and then create a column, where the relative deviation to the mean is displayed (e.g. Results Value / mean-value * 100). The N and SD column were just included for completeness and is not needed for the calculation.
Year 2024 2023 2022
Header N Result SD Deviation N Result SD Deviation N Result SD Deviation
Vendor
A 5 20 3 85.5 5 22 4 99.1 1 21 3 ..
B 4 25 2 106 4 25 3 113 4 26 5 ..
C 9 22 3 .. 9 27 1 .. 3 23 3 ..
D 3 23 5 .. 3 16 2 .. 5 13 4 ..
E 5 27 2 .. 5 21 3 .. 3 19 5 ..
How what i be able to achieve that?
Thanks a lot in advance!
Use DataFrame.xs
for select Result
labels in MultiIndex
, divide by mean and append to original in concat
, last for correct position add DataFrame.sort_index
with parameter sort_remaining=False
:
df1 = df.xs('Result', axis=1, level=1, drop_level=False)
out = (pd.concat([df,
df1.div(df1.mean()).mul(100)
.rename(columns={'Result':'Deviation'})], axis=1)
.sort_index(axis=1, ascending=False, level=0, sort_remaining=False))
print (out)
2024 2023 2022 \
N Result SD Deviation N Result SD Deviation N Result SD
A 5 20 3 85.470085 5 22 4 99.099099 1 21 3
B 4 25 2 106.837607 4 25 3 112.612613 4 26 5
C 9 22 3 94.017094 9 27 1 121.621622 3 23 3
D 3 23 5 98.290598 3 16 2 72.072072 5 13 4
E 5 27 2 115.384615 5 21 3 94.594595 3 19 5
Deviation
A 102.941176
B 127.450980
C 112.745098
D 63.725490
E 93.137255
Another loop idea:
for x in df.columns.levels[0]:
df[(x, 'Deviation')] = df[(x, 'Result')].div(df[(x, 'Result')].mean()).mul(100)
out = df.sort_index(axis=1, ascending=False, level=0, sort_remaining=False)
print (out)
2024 2023 2022 \
N Result SD Deviation N Result SD Deviation N Result SD
A 5 20 3 85.470085 5 22 4 99.099099 1 21 3
B 4 25 2 106.837607 4 25 3 112.612613 4 26 5
C 9 22 3 94.017094 9 27 1 121.621622 3 23 3
D 3 23 5 98.290598 3 16 2 72.072072 5 13 4
E 5 27 2 115.384615 5 21 3 94.594595 3 19 5
Deviation
A 102.941176
B 127.450980
C 112.745098
D 63.725490
E 93.137255