pythonpandasdataframefind-occurrences

How to get the 'number' of occurrence of a value in a column in another column in a pandas dataframe


I need to get the exact 'number' of occurrence of a particular value in a column in another column in dataframe. To make it simpler, need the cumulative count of a repetitive value in a column

Consider the following example:

col_A col_B
test1 some1
test2 some2
test2 some3
test2 some4
test3 some5
test1 some6
test3 some7

I need a col_C that gets the exact occurence number of col_A values Consider the following:

col_A col_B col_C
test1 some1 1
test2 some2 1
test2 some3 2
test2 some4 3
test3 some5 1
test1 some6 2
test3 some7 2

In row1 - 'test1' appears for the 1st time, so we got '1' in col_C; in row2 - 'test2' appears for the 1st time, so we got '1' in col_C; in row3 - 'test2' appears for the 2nd time, so we got '2' in col_C and so on

Did use value_counts, but am only getting the count of col_A values in col_C. I need the exact occurrence.

Thanks for the help!


Solution

  • What you're looking for is cumcount() from groupby. It counts the amount of the same values in each column starting from 0.
    This code should work.

    s = df.groupby('col_A').cumcount()
    df['col_C'] =  s+1