Tags: math, statistics, comments, voting, user-generated-content

How should I order these "helpful" scores?


Under the user generated posts on my site, I have an Amazon-like rating system:

   Was this review helpful to you: Yes | No

If there are votes, I display the results above that line like so:

   5 of 8 people found this reply helpful.

I would like to sort the posts based upon these rankings. If you were ranking from most helpful to least helpful, how would you order the following posts?

   a) 1/1 = 100% helpful
   b) 2/2 = 100% helpful
   c) 999/1000 = 99.9% helpful
   d) 3/4 = 75% helpful
   e) 299/400 = 74.8% helpful

Clearly, it's not right to sort on percent helpful alone; the total number of votes should somehow be factored in. Is there a standard way of doing this?

UPDATE:

Using Charles' formulas to calculate the Agresti-Coull lower bound and sorting on it, this is how the above examples would sort:

   1) 999/1000 (99.9%) = 95% likely to fall in 'helpfulness' range of 99.2% to 100%
   2) 299/400 (74.8%) = 95% likely to fall in 'helpfulness' range of 69.6% to 79.3%
   3) 3/4 (75%) = 95% likely to fall in 'helpfulness' range of 24.7% to 97.5%
   4) 2/2 (100%) = 95% likely to fall in 'helpfulness' range of 23.7% to 100%
   5) 1/1 (100%) = 95% likely to fall in 'helpfulness' range of 13.3% to 100%

Intuitively, this feels right.

UPDATE 2:

From an application point of view, I don't want to be running these calculations every time I pull up a list of posts. I'm thinking I'll store the Agresti-Coull lower bound and update it either on a regular, cron-driven schedule (updating only those posts which have received a vote since the last run) or whenever a new vote is received.
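
The "update whenever a new vote is received" option could look roughly like the sketch below. It is only an illustration: `Post`, `record_vote`, `list_posts` and the `sort_score` field are made-up names standing in for whatever the real schema uses, and the kappa constant is the one from the pseudocode in the answer.

```python
import math
from dataclasses import dataclass

KAPPA = 2.24140273  # constant taken from the answer's pseudocode


def agresti_coull_lower(total: int, helpful: int, kappa: float = KAPPA) -> float:
    """Lower end of the Agresti-Coull interval for helpful/total votes."""
    n_est = total + kappa * kappa
    p_est = (helpful + kappa * kappa / 2) / n_est
    radius = kappa * math.sqrt(p_est * (1 - p_est) / n_est)
    return max(0.0, p_est - radius)


@dataclass
class Post:  # hypothetical stand-in for a row in the posts table
    helpful: int = 0
    total: int = 0
    sort_score: float = 0.0  # cached lower bound, the only value the listing needs


def record_vote(post: Post, was_helpful: bool) -> None:
    """Update the counts and refresh the cached score at vote time."""
    post.total += 1
    if was_helpful:
        post.helpful += 1
    post.sort_score = agresti_coull_lower(post.total, post.helpful)


def list_posts(posts: list[Post]) -> list[Post]:
    """Sort on the stored value only; no statistics are computed at read time."""
    return sorted(posts, key=lambda p: p.sort_score, reverse=True)
```

Since votes are presumably much rarer than page views, recomputing one post's score per vote should be cheap; the cron variant would call the same function for every post flagged as changed since the last run.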


Solution

  • For each post, generate bounds on how helpful you expect it to be. I prefer to use the Agresti-Coull interval. Pseudocode:

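    #include <math.h>

    // Lower end of the Agresti-Coull interval for k helpful votes out of n total.
    float AgrestiCoullLower(int n, int k) {
      //float conf = 0.05;  // 95% confidence interval
      const float kappa = 2.24140273f; // In general, kappa = ierfc(conf/2)*sqrt(2)
      float kest = k + kappa*kappa/2;  // adjusted helpful count ('^' would be XOR in C)
      float nest = n + kappa*kappa;    // adjusted vote count
      float pest = kest/nest;          // adjusted proportion
      float radius = kappa*sqrtf(pest*(1 - pest)/nest);
      return fmaxf(0, pest - radius);  // Lower bound
      // Upper bound is fminf(1, pest + radius)
    }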
    

    Then take the lower end of the estimate and sort on it. For example, 2/2 is (by Agresti-Coull) 95% likely to fall in the 'helpfulness' range 23.7% to 100%, so it sorts below 999/1000, whose range is 99.2% to 100% (since .237 < .992).

    Edit: Since some people seem to have found this helpful (ha ha), let me note that the algorithm can be tweaked based on how confident/risk-averse you want to be. The less confidence you need, the more willing you will be to abandon the 'proven' (high-vote) reviews for the untested but high-scoring reviews. A 90% confidence interval gives kappa = 1.95996398, an 85% confidence interval gives 1.78046434, a 75% confidence interval gives 1.53412054, and the all-caution-to-the-wind 50% confidence interval gives 1.15034938. (These values follow the pseudocode's formula, kappa = ierfc(conf/2)*sqrt(2), which is stricter than the usual two-sided z-score: by the usual convention, 1.95996398 is the 95% value.)

    The 50% confidence interval gives

    1) 999/1000 (99.9%) = 50% likely to fall in 'helpfulness' range of 99.7% to 100%
    2) 299/400 (74.8%) = 50% likely to fall in 'helpfulness' range of 72.2% to 77.2%
    3) 2/2 (100%) = 50% likely to fall in 'helpfulness' range of 54.9% to 100%
    4) 3/4 (75%) = 50% likely to fall in 'helpfulness' range of 45.7% to 91.9%
    5) 1/1 (100%) = 50% likely to fall in 'helpfulness' range of 37.5% to 100%
    

    which isn't that different overall, but it does prefer the 2/2 to the safety of the 3/4.
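
For anyone who wants to reproduce the two orderings, here is a rough Python port. It is a sketch under a few assumptions: `kappa_for`, `agresti_coull` and `ranked` are names invented for this example, and `kappa_for` rewrites kappa = ierfc(conf/2)*sqrt(2) as the normal quantile at 1 - conf/4 (the same quantity, since erfc(z/sqrt(2)) = 2*(1 - Phi(z))), which lets the standard library's `NormalDist` do the work.

```python
import math
from statistics import NormalDist


def kappa_for(conf: float) -> float:
    """kappa = ierfc(conf/2)*sqrt(2), expressed as a normal quantile."""
    return NormalDist().inv_cdf(1 - conf / 4)


def agresti_coull(n: int, k: int, kappa: float) -> tuple[float, float]:
    """(lower, upper) Agresti-Coull bounds for k helpful votes out of n."""
    n_est = n + kappa ** 2
    p_est = (k + kappa ** 2 / 2) / n_est
    radius = kappa * math.sqrt(p_est * (1 - p_est) / n_est)
    return max(0.0, p_est - radius), min(1.0, p_est + radius)


# (helpful, total) pairs from the question
posts = [(1, 1), (2, 2), (999, 1000), (3, 4), (299, 400)]


def ranked(conf: float) -> list[tuple[int, int]]:
    """Posts sorted by the lower bound, most helpful first."""
    kappa = kappa_for(conf)
    return sorted(posts, key=lambda p: agresti_coull(p[1], p[0], kappa)[0], reverse=True)


for conf in (0.05, 0.50):
    print(conf, ranked(conf))
```

With conf = 0.05 this gives the order from the question's update (999/1000, 299/400, 3/4, 2/2, 1/1); with conf = 0.50 the 2/2 post moves ahead of 3/4.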