Its the sample input. i wanted to group according to the Year column,and wanted to use value counts on month column, then to sort the 'month' column according to the month order.
Year | month |
---|---|
2000 | Oct |
2002 | Jan |
2002 | Mar |
2000 | Oct |
2002 | Mar |
2000 | Jan |
I did this:
df.groupby(['Year'])['month'].value_counts()
i got the following output:
year | month |
---|---|
2000 | Oct 2 |
Jan 1 | |
2002 | Mar 2 |
Jan 1 |
now i need to sort the month in the original month order.what can i do? i want the following output:
year | month |
---|---|
2000 | Jan 1 |
Oct 2 | |
2002 | Jan 1 |
Mar 2 |
You can use groupby() and sort_values(by=['Year', 'month'])
:
import pandas as pd
def _sort_month(df):
month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df['month'] = pd.Categorical(df['month'], categories=month_order, ordered=True)
GB = df.groupby(['Year'])['month'].value_counts()
G = GB.reset_index(name='count')
res = G.sort_values(by=['Year', 'month'])
res.set_index(['Year', 'month'], inplace=True)
return res[res['count'] > 0]
df = pd.DataFrame({'Year': [2000, 2002, 2002, 2000, 2002, 2000],
'month': ['Oct', 'Jan', 'Mar', 'Oct', 'Mar', 'Jan']})
print(_sort_month(df))
count
Year month
2000 Jan 1
Oct 2
2002 Jan 1
Mar 2