I have a dataset like:
cid_int item_id score
1 678 0.5
2 787 0.6
3 908 0.1
. . .
. . .
I'm fitting an ALS model on this PySpark DataFrame to get recommendations via collaborative filtering.
from pyspark.ml.recommendation import ALS
als = ALS(userCol="cid_int", itemCol="item_id", ratingCol="score", rank=5, maxIter=10, seed=0)
model = als.fit(X_train)
My question is: what does model.userFactors return? Does it return item embeddings, i.e. for m items will I get all m embeddings?
And if so, can I use KNN on these embeddings to find the closest items to a given item?
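Not quite: model.userFactors returns the embeddings for users (one row per cid_int), not for items. The item embeddings are in model.itemFactors (one row per item_id). Both are DataFrames with an id column and a features column, and in your case the vectors have dimension 5 because rank=5.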
Yes, you can use these embeddings in a KNN model. If the KNN model performs poorly, try increasing the rank value, which increases the dimension of the vectors.
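As a rough illustration of the KNN idea, here is a minimal sketch in plain Python. It assumes the factors have already been pulled out of Spark into a dict mapping id to vector; the helper names (cosine_similarity, nearest_items) and the example vectors are made up for illustration, and cosine similarity is just one reasonable choice of distance, not something ALS prescribes.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_items(item_vectors, query_id, k=3):
    """Return the k (id, similarity) pairs most similar to query_id, best first."""
    query = item_vectors[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in item_vectors.items()
        if other_id != query_id  # skip the query item itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up rank-5 vectors standing in for real factors. With a fitted model,
# a dict like this could be built from the id/features columns, e.g.
#   {r.id: list(r.features) for r in model.itemFactors.collect()}
item_vectors = {
    678: [1.0, 0.0, 0.0, 0.0, 0.0],
    787: [0.9, 0.1, 0.0, 0.0, 0.0],
    908: [0.0, 1.0, 0.0, 0.0, 0.0],
}

neighbors = nearest_items(item_vectors, 678, k=2)
for item_id, similarity in neighbors:
    print(item_id, round(similarity, 4))
```

This brute-force scan is fine for a few thousand items. For larger catalogs an approximate nearest-neighbor index would probably be a better fit.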