pythonpandassortinggroup-byyearmonth

Sorting Months inside a Multi index groupby object


Its the sample input. i wanted to group according to the Year column,and wanted to use value counts on month column, then to sort the 'month' column according to the month order.

Year month
2000 Oct
2002 Jan
2002 Mar
2000 Oct
2002 Mar
2000 Jan

I did this: df.groupby(['Year'])['month'].value_counts() i got the following output:

year month
2000 Oct 2
Jan 1
2002 Mar 2
Jan 1

now i need to sort the month in the original month order.what can i do? i want the following output:

year month
2000 Jan 1
Oct 2
2002 Jan 1
Mar 2

Solution

  • You can use groupby() and sort_values(by=['Year', 'month']):

    import pandas as pd
    
    
    def _sort_month(df):
        month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
        df['month'] = pd.Categorical(df['month'], categories=month_order, ordered=True)
        GB = df.groupby(['Year'])['month'].value_counts()
        G = GB.reset_index(name='count')
        res = G.sort_values(by=['Year', 'month'])
        res.set_index(['Year', 'month'], inplace=True)
        return res[res['count'] > 0]
    
    
    df = pd.DataFrame({'Year': [2000, 2002, 2002, 2000, 2002, 2000],
                       'month': ['Oct', 'Jan', 'Mar', 'Oct', 'Mar', 'Jan']})
    
    print(_sort_month(df))
    
    
    

    Prints

                count
    Year month       
    2000 Jan        1
         Oct        2
    2002 Jan        1
         Mar        2