I'm having problems on sorting my header in a way that shows all the meetings that happened on that year, than move on to the next year (M being meeting):
Given the Data frame:
Meeting M1/2023 M1/2024 M1/2025 M1/2026 M2/2023 M2/2024 M2/2025 M2/2026 M3/2023 M3/2024 (...)
Date
2023-01-02 13.75 11.7500 NaN NaN 13.75 11.2500 NaN NaN 13.75 10.9375 (...)
2023-01-03 13.75 11.7500 NaN NaN 13.75 11.2500 NaN NaN 13.75 10.8750 (...)
2023-01-04 13.75 11.8125 NaN NaN 13.75 11.4375 NaN NaN 13.75 11.0000 (...)
2023-01-05 13.75 11.7500 NaN NaN 13.75 11.3750 NaN NaN 13.75 11.0000 (...)
What the order should be:
M1/2023 M2/2023 M3/2023 M4/2023 M1/2024 M2/2024 M3/2024 M4/2024 M1/2025 M2/2025 (...)
Date
(...)
At first i tried to reorder using df.sort_values(), but the result was the same. My next move was to get the headers as strings so i could edit the data and add the year in front of the Meeting number, so the sort_valeus() would work:
headers_dict = {}
for i in df.columns:
headers_dict[i] = i.split('/')[1]+'_'+i.split('/')[0]
Output:
{'M1/2023': '2023_M1',
'M1/2024': '2024_M1',
'M1/2025': '2025_M1',
'M1/2026': '2026_M1',
'M2/2023': '2023_M2',
'M2/2024': '2024_M2',
'M2/2025': '2025_M2', (...)}
This part worked, but I could not put the edited strings back in to the df (I tried using .reindex() here) without turning all my data in to NaN. What should I do? Is there a way to do this? Sorry for my bad english.
You have two issues, you need to swap the order of the chunks and you need to use natural sorting.
I would use str.split
then sort_values
with natsort
as key:
from natsort import natsort_key
idx = (df.columns.to_series().str.split('/', expand=True)
.sort_values(by=[1, 0], key=natsort_key).index
)
out = df[idx]
Output:
M1/2023 M2/2023 M3/2023 M1/2024 M2/2024 M3/2024 M1/2025 M2/2025 M1/2026 M2/2026
Date
2023-01-02 13.75 13.75 13.75 11.7500 11.2500 10.9375 NaN NaN NaN NaN
2023-01-03 13.75 13.75 13.75 11.7500 11.2500 10.8750 NaN NaN NaN NaN
2023-01-04 13.75 13.75 13.75 11.8125 11.4375 11.0000 NaN NaN NaN NaN
2023-01-05 13.75 13.75 13.75 11.7500 11.3750 11.0000 NaN NaN NaN NaN
Assuming another example:
# input
M1/2023 M1/2024 M2/2023 M2/2024 M10/2023 M10/2024
0 NaN NaN NaN NaN NaN NaN
# output
M1/2023 M2/2023 M10/2023 M1/2024 M2/2024 M10/2024
0 NaN NaN NaN NaN NaN NaN