I have a df like this:
month | outcome | mom.ret |
---|---|---|
10/20 | winner | 0.2 |
10/20 | winner | 0.9 |
11/20 | winner | 0.6 |
11/20 | winner | 0.2 |
11/20 | winner | 0.9 |
10/20 | loser | 0.6 |
10/20 | loser | 0.2 |
10/20 | loser | 0.9 |
11/20 | loser | 0.6 |
I would like to add another column whose value is 1 divided by the number of times the value "winner" or "loser" appears in the outcome column for each month. The expected output for the example df is:
month | outcome | mom.ret | q |
---|---|---|---|
10/20 | winner | 0.2 | 1/2 |
10/20 | winner | 0.9 | 1/2 |
11/20 | winner | 0.6 | 1/3 |
11/20 | winner | 0.2 | 1/3 |
11/20 | winner | 0.9 | 1/3 |
10/20 | loser | 0.6 | 1/3 |
10/20 | loser | 0.2 | 1/3 |
10/20 | loser | 0.9 | 1/3 |
11/20 | loser | 0.6 | 1/1 |
I thought of using the count function to count how many times each value is repeated, but I need the count to be done separately for each month. Any ideas?
You can use this code to achieve what you want, assuming your original DataFrame is called df:
import pandas as pd

# Count the rows in each (month, outcome) group
counts = df.groupby(['month', 'outcome'], as_index=False).count()
counts = counts.rename(columns={'mom.ret': 'q'})
# Use this line if you want the float value of the division, e.g. 0.5
# counts['q'] = 1/counts['q']
# Use this line if you want the string '1/2'
counts['q'] = counts['q'].apply(lambda x: f'1/{x}')
# Merge on the shared columns (month, outcome) to attach q to every row
result = pd.merge(df, counts)
The result looks like this:
month outcome mom.ret q
0 10/20 winner 0.2 1/2
1 10/20 winner 0.9 1/2
2 11/20 winner 0.6 1/3
3 11/20 winner 0.2 1/3
4 11/20 winner 0.9 1/3
5 10/20 loser 0.6 1/3
6 10/20 loser 0.2 1/3
7 10/20 loser 0.9 1/3
8 11/20 loser 0.6 1/1
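As an alternative, here is a minimal sketch that skips the merge by using groupby with transform, which broadcasts each group's size back onto its rows. This is a swapped-in technique, not the code above. The DataFrame is rebuilt from the question's example, and the names group_size and q_label are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "month": ["10/20", "10/20", "11/20", "11/20", "11/20",
              "10/20", "10/20", "10/20", "11/20"],
    "outcome": ["winner"] * 5 + ["loser"] * 4,
    "mom.ret": [0.2, 0.9, 0.6, 0.2, 0.9, 0.6, 0.2, 0.9, 0.6],
})

# Number of rows in each (month, outcome) group, repeated on every row
# of that group so it lines up with df's original index.
group_size = df.groupby(["month", "outcome"])["mom.ret"].transform("size")

# Float version of the weight, e.g. 0.5 for a group of two rows.
df["q"] = 1 / group_size

# String version of the weight, e.g. "1/2" (hypothetical extra column).
df["q_label"] = "1/" + group_size.astype(str)

print(df)
```

Note that "size" counts every row in the group, while count() skips rows where the counted column is NaN, so the two approaches can differ if mom.ret has missing values.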