[SOLVED] How to get the 'number' of occurrence of a value in a column in another column in a pandas dataframe

How to get the 'number' of occurrence of a value in a column in another column in a pandas dataframe

I need to get the exact 'number' of occurrence of a particular value in a column in another column in dataframe. To make it simpler, need the cumulative count of a repetitive value in a column

Consider the following example:

col_A	col_B
test1	some1
test2	some2
test2	some3
test2	some4
test3	some5
test1	some6
test3	some7

I need a col_C that gets the exact occurence number of col_A values Consider the following:

col_A	col_B	col_C
test1	some1	1
test2	some2	1
test2	some3	2
test2	some4	3
test3	some5	1
test1	some6	2
test3	some7	2

In row1 - 'test1' appears for the 1st time, so we got '1' in col_C; in row2 - 'test2' appears for the 1st time, so we got '1' in col_C; in row3 - 'test2' appears for the 2nd time, so we got '2' in col_C and so on

Did use value_counts, but am only getting the count of col_A values in col_C. I need the exact occurrence.

Thanks for the help!

Solution

What you're looking for is cumcount() from groupby. It counts the amount of the same values in each column starting from 0.
This code should work.

s = df.groupby('col_A').cumcount()
df['col_C'] =  s+1