I have a dataframe:
data = {
'group': ['2', '1', '2', '2', '2', '1'],
'interval': ['20-30', '20-30', '30-40', '10-20', '10-20', '0-10'],
'count': [3, 4, 2, 7, 5, 1],
}
df = pd.DataFrame(data)
group interval count
2 20-30 3
1 20-30 4
2 30-40 2
2 10-20 7
2 10-20 5
1 0-10 1
I want to sort group
first then interval
simultaneously in ascending order which will look like this:
group interval count
1 00-10 1
1 20-30 4
2 10-20 5
2 10-20 7
2 20-30 3
2 30-40 2
I know how to do this separately but how to do it simultaneously?
(
df
.sort_values(by = ['group'], key = lambda s: s.str[0:].astype(int))
.sort_values(by = ['interval'], key = lambda s: s.str[:2].astype(int))
.reset_index(drop=True)
)
If you want to use different custom keys in the same sort_values
call, you could use a dictionary:
keys = {'group': lambda s: s.str[0:].astype(int),
'interval': lambda s: s.str[:2].astype(int)
}
df.sort_values(by=['group', 'interval'],
key=lambda x: keys[x.name](x),
ignore_index=True)
Or with a custom function:
def cust_sort(s):
match s.name:
case 'group':
return s.str[0:].astype(int)
case 'interval':
return s.str[:2].astype(int)
case _:
return s
df.sort_values(by=['group', 'interval'],
key=cust_sort,
ignore_index=True)
Output:
group interval count
0 1 00-10 1
1 1 20-30 4
2 2 10-20 7
3 2 10-20 5
4 2 20-30 3
5 2 30-40 2