I'm trying to use Apache Mahout to build an item-based recommender that, given an item, returns similar items based on which users have those items in common.
I start by creating a DataModel, and I've tried passing it into several different ItemSimilarity implementations:
// Create data model
DataModel datamodel = new FileDataModel(new File("input.csv"));
// ItemSimilarity object
// ItemSimilarity similarity = new EuclideanDistanceSimilarity(datamodel);
// ItemSimilarity similarity = new PearsonCorrelationSimilarity(datamodel);
ItemSimilarity similarity = new CityBlockSimilarity(datamodel);
Then I pass the DataModel and ItemSimilarity into a GenericItemBasedRecommender, call mostSimilarItems(), and store the result in a list:
ItemBasedRecommender irecommender = new GenericItemBasedRecommender(datamodel, similarity);
List<RecommendedItem> irecommendations = irecommender.mostSimilarItems(item, amount);
CityBlockSimilarity worked great on a small data set, but as soon as I switched to a large data set its results were no longer reliable.
Is there a different similarity class I should use to get recommendations for an item based on the other items that users have in common?
It turns out the class I needed was TanimotoCoefficientSimilarity. Once I switched to it, I got the results I wanted:
ItemSimilarity similarity = new TanimotoCoefficientSimilarity(datamodel);
I was able to leave everything else the same and it worked great! See the Javadoc for the TanimotoCoefficientSimilarity class if you want to read more about it.
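For anyone wondering why the switch might matter, here is a rough, Mahout-free sketch of the two measures computed over plain sets of user IDs. The Tanimoto coefficient is the number of users who have both items divided by the number who have either. The city-block formula below, 1 / (1 + number of users who have exactly one of the two items), is my assumption about how a set-based city-block similarity behaves, not something copied from Mahout's source. The class name, helper names, and sample data are all made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class SimilaritySketch {

    // Number of users that appear in both sets.
    static int intersectionSize(Set<Long> a, Set<Long> b) {
        Set<Long> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    // Tanimoto coefficient: size of intersection divided by size of union.
    static double tanimoto(Set<Long> a, Set<Long> b) {
        int both = intersectionSize(a, b);
        int either = a.size() + b.size() - both;
        return either == 0 ? 0.0 : (double) both / either;
    }

    // ASSUMED set-based city-block similarity: the distance is the number of
    // users who have exactly one of the two items, mapped to 1 / (1 + distance).
    static double cityBlock(Set<Long> a, Set<Long> b) {
        int distance = a.size() + b.size() - 2 * intersectionSize(a, b);
        return 1.0 / (1.0 + distance);
    }

    // Hypothetical helper: user IDs from 'from' (inclusive) to 'to' (exclusive).
    static Set<Long> range(long from, long to) {
        Set<Long> users = new HashSet<>();
        for (long id = from; id < to; id++) {
            users.add(id);
        }
        return users;
    }

    public static void main(String[] args) {
        // Two popular items with 100 users each, 90 of them shared.
        Set<Long> popularA = range(0, 100);
        Set<Long> popularB = range(10, 110);

        // Two rare items with one user each and no users in common.
        Set<Long> rareA = range(500, 501);
        Set<Long> rareB = range(501, 502);

        System.out.println("popular pair: tanimoto=" + tanimoto(popularA, popularB)
                + " cityBlock=" + cityBlock(popularA, popularB));
        System.out.println("rare pair:    tanimoto=" + tanimoto(rareA, rareB)
                + " cityBlock=" + cityBlock(rareA, rareB));
    }
}
```

Under these assumptions, the heavily overlapping popular pair scores about 0.82 with Tanimoto but only about 0.05 with city-block, while the unrelated rare pair scores 0 with Tanimoto and about 0.33 with city-block. If Mahout's implementation behaves similarly, that would explain why a large data set with many rarely used items gave me odd "most similar" results.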