pythonpandasgroup-bylines-of-code

Create a column ranking order of records by date


Imagine I have the following dataframe with employees, their contract type (values could be Employee, Contractor and Agency). Also, one person can have more than 1 contract as you could see in the dataframe example below:

ID      Name    Contract    Date
10000   John    Employee    2021-01-01
10000   John    Employee    2021-01-01
10000   John    Employee    2020-03-06
10000   John    Contractor  2021-01-03
10000   John    Agency      2021-01-01
10000   John    Contractor  2021-02-01
10001   Carmen  Employee    1988-06-03
10001   Carmen  Employee    2021-02-03
10001   Carmen  Contractor  2021-02-03
10002   Peter   Contractor  2021-02-03
10003   Fred    Employee    2020-01-05
10003   Fred    Employee    1988-06-03

I need to find a way that, per each unique ID, and each unique Contract Type, it created a column named "Order" that would rank, starting with 1 on the oldest contract, each of the contract types each ID have.If the date is the same, the rank order does not matter. This would result on the following dataframe:

ID      Name    Contract    Date        Order
10000   John    Employee    2021-01-01  1
10000   John    Employee    2021-01-01  2
10000   John    Employee    2020-03-06  3
10000   John    Contractor  2021-01-03  2
10000   John    Agency      2021-01-01  1
10000   John    Contractor  2021-02-01  1
10001   Carmen  Employee    1988-06-03  1
10001   Carmen  Employee    2021-02-03  2
10001   Carmen  Contractor  2021-02-03  1
10002   Peter   Contractor  2021-02-03  1
10003   Fred    Employee    2020-01-05  2
10003   Fred    Employee    1988-06-03  1

Solution

  • Here is an approach: You first sort df using df.sort_values() and finally use df.groupby().cumcount()

    df['Data'] = pd.to_datetime(df['Date'])
    df = df.sort_values(['ID', 'Contract', 'Date'])
    df['Order'] = df.groupby(['ID', 'Contract']).cumcount() + 1
    print(df)
    

           ID    Name    Contract        Date       Data  Order
    4   10000    John      Agency  2021-01-01 2021-01-01      1
    3   10000    John  Contractor  2021-01-03 2021-01-03      1
    5   10000    John  Contractor  2021-02-01 2021-02-01      2
    2   10000    John    Employee  2020-03-06 2020-03-06      1
    0   10000    John    Employee  2021-01-01 2021-01-01      2
    1   10000    John    Employee  2021-01-01 2021-01-01      3
    8   10001  Carmen  Contractor  2021-02-03 2021-02-03      1
    6   10001  Carmen    Employee  1988-06-03 1988-06-03      1
    7   10001  Carmen    Employee  2021-02-03 2021-02-03      2
    9   10002   Peter  Contractor  2021-02-03 2021-02-03      1
    11  10003    Fred    Employee  1988-06-03 1988-06-03      1
    10  10003    Fred    Employee  2020-01-05 2020-01-05      2