python pandas sorting key

How to sort the columns of a pandas DataFrame into a given order using the key argument


Assume a pandas DataFrame with three columns (three for the sake of simplicity), titled A, B and d.

$ import pandas as pd
$ df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=['A', 'B', 'd'])
$ df
   A  B  d
0  1  2  a
1  1  b  3
2  c  4  6

Further assume that I wish to sort the data frame so that the columns appear in exactly the following order: d, A, B. The rows must not be rearranged in any way. The desired output is:

$ col_target_order = ['d', 'A', 'B']
$ df_desired
   d  A  B
0  a  1  2
1  3  1  b
2  6  c  4

I know that this can be done via the sort_index method of pandas. However, the following won't work, because key expects a callable and the list (col_target_order) is not callable:

$ df.sort_index(axis=1, key=col_target_order)

What key specification do I have to use?


Solution

  • Don't sort, just index:

    out = df[col_target_order]
    

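    For the sake of argument, you could still use sort_index: key must be a callable that receives the column Index and returns one sort key per label. The get method of a crafted Series does that: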

    df.sort_index(axis=1, key=pd.Series(range(len(col_target_order)), index=col_target_order).get)
    

    Or use the get_indexer method of an Index built from the target order:

    df.sort_index(axis=1, key=pd.Index(col_target_order).get_indexer)
    

    Output:

       d  A  B
    0  a  1  2
    1  3  1  b
    2  6  c  4
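
To make the key contract explicit, here is a minimal sketch that wraps the same idea in a named function. The names rank and by_target_order are illustrative only and do not appear in the question. It assumes a pandas version in which sort_index accepts key (1.1 or later) and that every column label appears in col_target_order.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=["A", "B", "d"])
col_target_order = ["d", "A", "B"]

# Hypothetical helper mapping: column label -> position in the target order.
rank = {label: pos for pos, label in enumerate(col_target_order)}

def by_target_order(labels):
    # sort_index calls this once with the whole column Index and expects
    # an Index-like result of the same length (one sort rank per label).
    return labels.map(rank)

out_indexed = df[col_target_order]                       # plain indexing
out_sorted = df.sort_index(axis=1, key=by_target_order)  # sorting with a key

print(list(out_sorted.columns))  # ['d', 'A', 'B']
```

Both approaches should give the same frame here, but plain indexing stays the simpler choice: it needs no helper and raises a KeyError if a label in col_target_order is missing from the frame.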