I have a pandas dataframe like this, with user_id, the title of the song listened to by the user, and the number of times that a specific user has listened to that song (listen_count).
I'm new to Python and pandas and I'm trying to build a recommender system. I want to transform this implicit feedback (listen_count) into explicit ratings following formulas (8) and (9) of this paper.
listen_count value in my dataframe), divided by the total number of plays made by the user across all the songs they have listened to (the total listen_count for each user).
You should be able to solve this problem by using DataFrame.groupby(). Assuming that your dataframe is called df, you can try the following (it's hard for me to check whether it produces the right result without the data).
# get the total listen count for each user_id
df['total_listen_count_per_user'] = df.groupby('user_id')['listen_count'].transform('sum')
# get the song frequency by dividing each row's listen_count by
# the total listen count for that user
df['song_frequency'] = df['listen_count'] / df['total_listen_count_per_user']
See the pandas reference for DataFrame.transform and DataFrame.groupby.
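Since the original data isn't available, here is a minimal sketch on made-up data to check the idea. It assumes one row per (user_id, title) pair and takes the frequency to be a row's listen_count divided by that user's total listen_count, as the question describes it. The user ids, titles, and counts below are invented purely for illustration.

```python
import pandas as pd

# Made-up sample data (hypothetical users and titles), one row per (user_id, title)
df = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2"],
    "title": ["Song A", "Song B", "Song C", "Song A", "Song D"],
    "listen_count": [2, 6, 2, 1, 3],
})

# transform('sum') returns a result aligned with the original rows,
# so every row receives the total listen count of its own user
df["total_listen_count_per_user"] = (
    df.groupby("user_id")["listen_count"].transform("sum")
)

# per-row frequency: this song's plays divided by all plays of that user
df["song_frequency"] = df["listen_count"] / df["total_listen_count_per_user"]

print(df)
```

A quick way to sanity-check the result on real data is that the frequencies of each user should add up to 1, e.g. df.groupby('user_id')['song_frequency'].sum().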