[SOLVED] Filling a column with the amount of duplicated values in another column

I have a df like this:

month	outcome	mom.ret
10/20	winner	0.2
10/20	winner	0.9
11/20	winner	0.6
11/20	winner	0.2
11/20	winner	0.9
10/20	loser	0.6
10/20	loser	0.2
10/20	loser	0.9
11/20	loser	0.6

I would like to add another column, q, equal to 1 divided by the number of times the value "winner" or "loser" appears in the outcome column for each month. The expected output for the example df is:

month	outcome	mom.ret	q
10/20	winner	0.2	1/2
10/20	winner	0.9	1/2
11/20	winner	0.6	1/3
11/20	winner	0.2	1/3
11/20	winner	0.9	1/3
10/20	loser	0.6	1/3
10/20	loser	0.2	1/3
10/20	loser	0.9	1/3
11/20	loser	0.6	1/1

I thought of using count to find how many times each value is repeated, but I need the count to be done separately for each month. Any ideas?

Solution

You can use this code to achieve what you want, assuming your original DataFrame is called df:

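import pandas as pd

# Count the rows in each (month, outcome) group; the count ends up in 'mom.ret'
counts = df.groupby(['month', 'outcome'], as_index=False).count()
counts = counts.rename(columns={'mom.ret': 'q'})
# Use this line if you want the float value of the division, e.g. 0.5
# counts['q'] = 1 / counts['q']
# Use this line if you want the string '1/2'
counts['q'] = counts['q'].apply(lambda x: f'1/{x}')
# Attach q to every original row by matching on month and outcome
result = pd.merge(df, counts, on=['month', 'outcome'])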

The result looks like this:

   month outcome  mom.ret    q
0  10/20  winner      0.2  1/2
1  10/20  winner      0.9  1/2
2  11/20  winner      0.6  1/3
3  11/20  winner      0.2  1/3
4  11/20  winner      0.9  1/3
5  10/20   loser      0.6  1/3
6  10/20   loser      0.2  1/3
7  10/20   loser      0.9  1/3
8  11/20   loser      0.6  1/1
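
As a possible alternative (not part of the answer above), the group size can be broadcast straight back onto the rows with groupby(...).transform('size'), which avoids the rename and the merge. This is only a sketch: it rebuilds the example df from the question, and the names group_size and q_label are made up for illustration. Note that 'size' counts every row in the group, whereas count() skips rows where mom.ret is missing.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "month": ["10/20", "10/20", "11/20", "11/20", "11/20",
              "10/20", "10/20", "10/20", "11/20"],
    "outcome": ["winner"] * 5 + ["loser"] * 4,
    "mom.ret": [0.2, 0.9, 0.6, 0.2, 0.9, 0.6, 0.2, 0.9, 0.6],
})

# Number of rows in each (month, outcome) group, repeated on every row
# of that group. group_size is a hypothetical helper name.
group_size = df.groupby(["month", "outcome"])["mom.ret"].transform("size")

# Numeric weight, e.g. 0.5 for a group of two rows.
df["q"] = 1 / group_size

# String label, e.g. '1/2' (q_label is a hypothetical column name).
df["q_label"] = "1/" + group_size.astype(str)

print(df)
```

Because transform returns a Series aligned with the original index, the row order of df is left untouched and no join keys are needed.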