pandas

incremental counter to a dataframe based on unique values from a column


Existing Dataframe :

Unique_Id          sms_sent_date         
A                    11-01-2022                   
A                    11-01-2022                   
A                    12-01-2022                   
A                    13-01-2022                   

B                    02-02-2022                   
B                    03-02-2022                   
B                    06-02-2022                   
B                    06-02-2022                   

Expected Dataframe :

Unique_Id          sms_sent_date         unique_sms_counter
A                    11-01-2022                   1
A                    11-01-2022                   1
A                    12-01-2022                   2
A                    13-01-2022                   3

B                    02-02-2022                   1
B                    03-02-2022                   2
B                    06-02-2022                   3
B                    06-02-2022                   3

I trying to add the smsq_counter which shall count the unique sms sent to a unique Id. tried this df.groupby(['Unique_Id','sms_sent_date']).cumcount() but stuck with applying it on entire dataframe


Solution

  • You can use:

    df['unique_sms_counter'] = (
      # get non duplicates
     (~df.duplicated(['Unique_Id', 'sms_sent_date']))
      # increment per group
     .groupby(df['Unique_Id']).cumsum()
    )
    

    output:

      Unique_Id sms_sent_date  unique_sms_counter
    0         A    11-01-2022                   1
    1         A    11-01-2022                   1
    2         A    12-01-2022                   2
    3         A    13-01-2022                   3
    4         B    02-02-2022                   1
    5         B    03-02-2022                   2
    6         B    06-02-2022                   3
    7         B    06-02-2022                   3