python regex fraud-prevention

Regex for keyboard mashing


When signing up for new accounts, web apps often ask for the answer to a 'security question', e.g. dog's name, etc.

I'd like to go through our database and look for instances where users just mashed the keyboard instead of providing a legitimate answer. This is a strong indicator of an abusive/fraudulent account.

"Mother's maiden name?" lakdsjflkaj

Any suggestions as to how I should go about doing this?

Note: I'm not ONLY using regular expressions on these 'security question answers'

The 'answers' can be:

  1. Selected from a db using a few basic sql regexes

  2. Analyzed as many times as necessary using python regexes

  3. Compared/pruned/scored as needed

This is a technical question, not a philosophical one ;-)

Thanks!


Solution

  • You're probably better off analyzing n-gram distribution, similar to language detection.

    Language detection using character trigrams is a good example of the technique. My guess is that keyboard-mashing trigrams are pretty distinctive and rarely appear in normal language.
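A minimal sketch of the trigram idea, not a tuned detector. It assumes you have a reference list of legitimate answers (the tiny `REFERENCE_WORDS` list below is a hypothetical stand-in; a real one would be a large corpus of names and dictionary words). The helper names `trigrams`, `build_model` and `known_fraction` are made up for illustration. Each answer is scored by the fraction of its character trigrams that also occur in the reference data, so mashed input should score close to zero.

```python
from collections import Counter

# Hypothetical reference sample. In practice, load a large list of
# real names / dictionary words instead of this toy list.
REFERENCE_WORDS = [
    "smith", "johnson", "williams", "brown", "jones", "miller", "davis",
    "anderson", "thompson", "martin", "robinson", "peterson", "buddy",
    "charlie", "bella", "rocky", "daisy", "london", "chicago", "boston",
]


def trigrams(text):
    """Return the character trigrams of text, padded to mark word edges."""
    padded = "  " + text.strip().lower() + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def build_model(words):
    """Count every trigram seen in the reference words."""
    counts = Counter()
    for word in words:
        counts.update(trigrams(word))
    return counts


def known_fraction(text, model):
    """Fraction of the trigrams in text that appear in the model (0.0 to 1.0)."""
    grams = trigrams(text)
    seen = sum(1 for gram in grams if gram in model)
    return seen / len(grams)


model = build_model(REFERENCE_WORDS)

for answer in ["anderson", "lakdsjflkaj"]:
    print(answer, round(known_fraction(answer, model), 2))
```

With a reference list this small, plenty of legitimate answers will also score low, so the score is only meaningful once the model is built from a large corpus. Even then it is probably best treated as one signal to combine with the other checks rather than a verdict on its own, and the cutoff would need to be chosen by looking at real data.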