pandaspandas-groupbysklearn-pandasexpandablelistadapterpandas-datareader

pandas in same table (same dataframe), how to group different rows with new name and with sum of other row values


below dataframe is the output of below code i want to group rows further
train=pd.read_excel("monthly_report.xlsx", sheet_name="xy12",sep=r'\s*,\s*')
train['Date/Time Opened']=train['Date/Time Opened'].dt.month_name()
train=train.groupby(['col1', 'Date/Time Opened'])['Date/Time Opened'].count()

col1         Date/Time Opened    number
abc          April               40
             August              30
             December            25
             February            30
             January             45

xyz          April                1
             August               1
             November             3
             October              2
             September            3
pqr          March                2
             May                  4
             November             5
             October              2

now i want above format to be something like below. Thereafter, based on this i want to build the graph

abcxyz(new name)  April               41
                  August              31
                  December            25
                  February            30
                  January             45
                  September            3
                  November             3
                  October              2

pqr(new name)           
                 March                2
                 May                  4
                 November             5
                 October              2

Can someone please let me know how i can concatenate the rows with diffrenet values in new row and sum of rest other row values


Solution

  • You can use Series.mask with Series.isin for set same catagory:

    train['col1'] = train['col1'].mask(train['col1'].isin(['abc','xyz']), 'abcxyz')
    

    Or use Series.replace with dictionary:

    train['col1'] = train['col1'].replace({'abc':'abcxyz','xyz':'abcxyz'})
    

    ... and then use your solution:

    train['Date/Time Opened']=train['Date/Time Opened'].dt.month_name()
    train=train.groupby(['col1', 'Date/Time Opened'])['Date/Time Opened'].count()