Tags: algorithm, security, theory, user-generated-content

User-Generated Content View Validation


I am developing a user-generated content site. The goal is that users are rewarded if their content is viewed by a certain number of people. Whereas a user account is required to post content, an account is not required to view content.

I am currently developing the algorithm to count the number of valid views, and I am concerned that users could create bots to falsely inflate their view counts. I could exclude views from the content creator’s IP, but I do not want to exclude valid views from other users who share the same external IP address. A single external IP address could in fact account for a large number of valid views on a college campus or in a corporate setting.

The site is implemented in Python and hosted on Apache servers. The question is more theoretical in nature: how can I establish whether traffic from the same IP is legitimate? I can’t find any content management systems that do this, so I was planning to implement it myself.


Solution

  • You cannot reliably do this. Any check you create can be passed by an automated client.

    That said, you can raise the bar. For instance, every page you serve can have a random number encoded into a piece of JavaScript that sends it back in an AJAX request. Any view with a matching AJAX request probably came from a real browser, and is likely to be a real human, since few bots handle JavaScript correctly. But absolutely nothing stops someone from writing a script that drives a real browser.
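The token idea above could be sketched in Python roughly as follows. This is only a minimal in-memory illustration, not a hardened implementation: the `ViewTracker` class, its method names, and the 5-minute expiry are all assumptions made for the example. A real site would keep pending tokens in shared storage and call `issue_token` from the page view and `confirm` from the AJAX endpoint.

```python
import secrets
import time


class ViewTracker:
    """Hypothetical sketch: a view counts only once its token is confirmed."""

    def __init__(self, ttl_seconds=300):
        self.ttl_seconds = ttl_seconds  # assumed expiry window for a token
        self.pending = {}    # token -> (content_id, issued_at)
        self.confirmed = {}  # content_id -> number of confirmed views

    def issue_token(self, content_id, now=None):
        """Call when rendering a page; embed the returned token in its JavaScript."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)  # unpredictable random value
        self.pending[token] = (content_id, now)
        return token

    def confirm(self, token, now=None):
        """Call from the AJAX endpoint; returns True if the view was counted."""
        now = time.time() if now is None else now
        # pop() makes each token single-use, so replaying a request does nothing
        entry = self.pending.pop(token, None)
        if entry is None:
            return False
        content_id, issued_at = entry
        if now - issued_at > self.ttl_seconds:
            return False  # token expired before the browser called back
        self.confirmed[content_id] = self.confirmed.get(content_id, 0) + 1
        return True
```

Because each token is random and single-use, a client that merely fetches the HTML, or replays an old AJAX request, adds nothing to the count. As the answer notes, a script that drives a real browser would still pass this check.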