pythonpandassorting

Python Two Custom Sort By Two String Variables


I have a dataframe:

data = {
    'group': ['2', '1', '2', '2', '2', '1'],
    'interval': ['20-30', '20-30', '30-40', '10-20', '10-20', '0-10'],
    'count': [3, 4, 2, 7, 5, 1],
}
df = pd.DataFrame(data)
group   interval    count
2          20-30    3
1          20-30    4
2          30-40    2
2          10-20    7
2          10-20    5
1          0-10     1

I want to sort group first then interval simultaneously in ascending order which will look like this:

group   interval    count
1       00-10   1
1       20-30   4
2       10-20   5
2       10-20   7
2       20-30   3
2       30-40   2

I know how to do this separately but how to do it simultaneously?

(
    df
    .sort_values(by = ['group'], key = lambda s: s.str[0:].astype(int))
    .sort_values(by = ['interval'], key = lambda s: s.str[:2].astype(int))
    .reset_index(drop=True)
)

Solution

  • If you want to use different custom keys in the same sort_values call, you could use a dictionary:

    keys = {'group': lambda s: s.str[0:].astype(int),
            'interval': lambda s: s.str[:2].astype(int)
           }
    
    df.sort_values(by=['group', 'interval'],
                   key=lambda x: keys[x.name](x),
                   ignore_index=True)
    

    Or with a custom function:

    def cust_sort(s):
        match s.name:
            case 'group':
                return s.str[0:].astype(int)
            case 'interval':
                return s.str[:2].astype(int)
            case _:
                return s
    
    df.sort_values(by=['group', 'interval'],
                   key=cust_sort,
                   ignore_index=True)
    

    Output:

      group interval  count
    0     1    00-10      1
    1     1    20-30      4
    2     2    10-20      7
    3     2    10-20      5
    4     2    20-30      3
    5     2    30-40      2