Tags: python, machine-learning, pyspark, collaborative-filtering, als

Find closest item from ALS model using KNN


I have a dataset like:

cid_int   item_id   score
  1         678      0.5
  2         787      0.6
  3         908      0.1
  .          .        .
  .          .        .

I'm fitting an ALS model on this PySpark DataFrame to get recommendations with collaborative filtering.

from pyspark.ml.recommendation import ALS

als = ALS(userCol="cid_int", itemCol="item_id", ratingCol="score", rank=5, maxIter=10, seed=0)
model = als.fit(X_train)

My question is: what does model.userFactors return? Does it return the item embeddings, so that for m items I get m embeddings?

And if so, can I use KNN on these embeddings to find the items closest to a given item?


Solution

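  • model.userFactors returns the embeddings for users, not for items: one row per user, with an id column and a features vector. The item embeddings are in model.itemFactors, which has one row for each item seen during training. With rank=5, both sets of vectors have dimension 5.

    Yes, you can run KNN on the model.itemFactors vectors to find the items closest to a given item. If the neighbours look poor, try increasing the rank value, which increases the dimension of the vectors.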
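
As a rough illustration of the KNN step, here is a minimal sketch in plain Python. It assumes the item catalogue is small enough to collect onto the driver, and it uses cosine similarity as the distance, which is a choice on my part rather than something ALS prescribes. The helper names (collect_item_factors, nearest_items) and the toy vectors are hypothetical; only model.itemFactors with its id and features columns comes from the fitted model.

```python
import heapq
import math


def collect_item_factors(model):
    """Pull {item_id: vector} from a fitted ALSModel onto the driver.

    model.itemFactors is a DataFrame with columns `id` and `features`.
    Assumption: the number of items is small enough to fit in driver memory.
    """
    return {row["id"]: list(row["features"]) for row in model.itemFactors.collect()}


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_items(item_vectors, item_id, k=5):
    """Return up to k (other_id, similarity) pairs, most similar first."""
    query = item_vectors[item_id]
    scored = (
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in item_vectors.items()
        if other_id != item_id  # skip the query item itself
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy rank-5 vectors standing in for collect_item_factors(model).
# The values are made up for illustration, not real model output.
toy_vectors = {
    678: [1.0, 0.0, 0.0, 0.0, 0.0],
    787: [0.9, 0.1, 0.0, 0.0, 0.0],
    908: [0.0, 1.0, 0.0, 0.0, 0.0],
}
neighbours = nearest_items(toy_vectors, 678, k=2)
```

With a real model you would call nearest_items(collect_item_factors(model), some_item_id). This brute-force scan is linear in the number of items per query, so for a large catalogue an approximate nearest-neighbour index is likely a better fit.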