I'm building a system similar to Reddit, where users "like" items. "Likes" would be used to determine ranking of items. There's also an "aging" factor, where more recent "likes" count more than ancient "likes".
All in all, it's similar to the algorithm described here.
My problem is that I need to ensure diversity of the items in the resulting ranking. Each item belongs to a category, and certain categories may be disproportionately popular. I don't want all the items on the front page (or the 2nd page) to belong to Category A while items from other categories are nowhere to be found.
So is there a clever algorithm that can ensure diversity of results here, so that every page has a nice mix of different categories?
Thanks
For each category, create a separate ranking of all the items in that category. Then, when you generate your feed, you can combine the individual rankings in different ways. For example, you could merge the categories randomly and evenly: for each spot in the feed, pick a category at random and take the highest-ranked item from that category that you haven't already put into the feed.
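Here is a minimal Python sketch of that random, even merge. The function name `diversified_feed`, the `(item_id, category, score)` tuple layout, and the `rng` parameter are all assumptions made for illustration; `score` stands for whatever like-and-aging score you already compute.

```python
import random
from collections import defaultdict, deque


def diversified_feed(items, feed_size, rng=None):
    """Merge per-category rankings by picking a category at random per slot.

    items: iterable of (item_id, category, score) tuples (hypothetical layout).
    """
    if rng is None:
        rng = random.Random()

    # Group items by category.
    by_category = defaultdict(list)
    for item_id, category, score in items:
        by_category[category].append((score, item_id))

    # One ranking per category, highest score first.
    rankings = {
        category: deque(
            item_id
            for _, item_id in sorted(entries, key=lambda e: e[0], reverse=True)
        )
        for category, entries in by_category.items()
    }

    feed = []
    while rankings and len(feed) < feed_size:
        # Every non-empty category is equally likely, regardless of popularity.
        category = rng.choice(list(rankings))
        feed.append(rankings[category].popleft())
        if not rankings[category]:
            del rankings[category]  # category exhausted, stop drawing from it
    return feed


items = [
    ("a1", "A", 100), ("a2", "A", 90), ("a3", "A", 80),
    ("b1", "B", 5), ("b2", "B", 3),
]
feed = diversified_feed(items, feed_size=4, rng=random.Random(0))
```

Within each category the items still come out in score order; only the interleaving between categories is random. If a uniform pick gives unpopular categories too much space, one possible variation is to pick categories with weights (for example with `random.choices`) instead of uniformly.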