
How to apply a function on pandas dataframe column


I have a pandas dataframe with three columns: user_id, title (the song the user listened to), and listen_count (the number of times that user has listened to that song).


Goal to achieve:

I'm new to Python and pandas and I'm trying to build a recommender system. I want to transform this implicit feedback (listen_count) into explicit feedback following formulas (8) and (9) of this paper.


Solution

  • You should be able to solve this with DataFrame.groupby(). Assuming your dataframe is called df, you can try the following (without the data, I can't check that it produces the right result):

    # get the total listen count for each user_id
    df['total_listen_count_per_user'] = df.groupby('user_id')['listen_count'].transform('sum')
    # get the song frequency by dividing the sum of listen counts per song by
    # the total listen count for each user
    df['song_frequency'] = df.groupby('title')['listen_count'].transform('sum') / df['total_listen_count_per_user']
    

    For reference, see the pandas documentation for DataFrame.transform and DataFrame.groupby.
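
To make the two transform steps easier to check by hand, here is a minimal sketch on a toy dataframe. The rows and values are invented for illustration, and the extra column total_listen_count_per_song is a hypothetical helper that only splits the second step in two. Whether the resulting ratio matches formulas (8) and (9) depends on the paper, so treat it as something to verify rather than a confirmed implementation.

```python
import pandas as pd

# Toy data: values are made up purely for illustration.
df = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u2", "u2"],
    "title": ["Song A", "Song B", "Song A", "Song B", "Song C"],
    "listen_count": [3, 1, 2, 2, 4],
})

# transform('sum') returns one value per original row, so the per-user
# total is repeated on every row belonging to that user.
df["total_listen_count_per_user"] = (
    df.groupby("user_id")["listen_count"].transform("sum")
)

# Hypothetical helper column: total listens of each song across all users,
# again repeated on every row of that song.
df["total_listen_count_per_song"] = (
    df.groupby("title")["listen_count"].transform("sum")
)

# Same ratio as in the answer: per-song total divided by per-user total.
df["song_frequency"] = (
    df["total_listen_count_per_song"] / df["total_listen_count_per_user"]
)

print(df)
```

Because transform keeps the original row order and index, the new columns line up with df without any merge, which is why it is used here instead of agg.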