php sorting digg

A Digg-like rotating homepage of popular content, how to include date as a factor?


I am building an advanced image sharing web application. As you may expect, users can upload images, and others can comment on them, vote on them, and favorite them. These events determine the popularity of an image, which I capture in a "karma" field.

Now I want to create a Digg-like homepage system that shows the most popular images. That part is easy, since I already have the weighted karma score: I just sort on it in descending order to show the 20 highest-valued images.

The part that is missing is time. I do not want extremely popular images to stay on the homepage forever. An easy solution would be to restrict the result set to the last 24 hours. However, to keep images rotating throughout the day, I am also thinking that time could be a variable whose offset influences an image's position in the sort order.

Specific questions:

I'm not asking the community to build this algorithm, just looking for some advice :)


Solution

  • I would go with a function that decreases the "effective karma" of each item after a given amount of time elapses. This is a bit like Eric's method.

    Determine how often you want the "effective karma" to be decreased. Then multiply the karma by a scaling factor based on this period.

    effective_karma = karma * (1 - percentage_decrease)
    

    where percentage_decrease is determined by your function. For instance, you could do

    percentage_decrease = min(1, number_of_hours_since_posting / 24)
    

    to make the effective karma of each item decrease linearly to 0 over 24 hours. Then use the effective karma to decide which images to show. This is more stable than just subtracting the time since posting, because it scales the karma between 0 and its actual value. The min caps percentage_decrease at 1 so the effective karma never drops below 0; without it, the ratio would exceed 1 once a day has passed and the effective karma would go negative.

    However, this doesn't take popularity in the strict sense into account. Tim's answer gives some ideas on how to factor in strict popularity (i.e. page views).
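
The linear decay described above can be sketched roughly as follows. This is only an illustration in Python (the same arithmetic ports directly to PHP), and everything beyond the two formulas is an assumption: the `posted_at` field, the dictionary layout of an image, the `top_images` helper, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta

DECAY_HOURS = 24  # assumption: karma fades to zero over one day


def effective_karma(karma, posted_at, now, decay_hours=DECAY_HOURS):
    """Scale karma linearly from its full value down to 0 over decay_hours."""
    hours_since_posting = (now - posted_at).total_seconds() / 3600
    # min caps the decrease at 100%; max guards against timestamps in the future
    percentage_decrease = min(1.0, max(0.0, hours_since_posting / decay_hours))
    return karma * (1 - percentage_decrease)


def top_images(images, now, limit=20):
    """Return the `limit` images with the highest effective karma."""
    return sorted(
        images,
        key=lambda img: effective_karma(img["karma"], img["posted_at"], now),
        reverse=True,
    )[:limit]


# Hypothetical sample data; in practice these rows would come from the database.
now = datetime(2024, 1, 2, 12, 0)
images = [
    {"id": 1, "karma": 100, "posted_at": now - timedelta(hours=18)},  # 100 * 0.25
    {"id": 2, "karma": 40, "posted_at": now - timedelta(hours=6)},    # 40 * 0.75
    {"id": 3, "karma": 500, "posted_at": now - timedelta(hours=30)},  # fully decayed
]
ranked = [img["id"] for img in top_images(images, now)]
```

Here the newer image with karma 40 outranks the older one with karma 100, and the image older than a day drops to the bottom regardless of its raw karma. With many images it would probably be better to compute the same expression in the database query rather than sorting in application code.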