When users sign up for new accounts, web apps often ask for the answer to a 'security question', e.g. your dog's name.
I'd like to go through our database and look for instances where users just mashed the keyboard instead of providing a legitimate answer - this is a strong indicator of an abusive/fraudulent account.
"Mother's maiden name?" lakdsjflkaj
Any suggestions as to how I should go about doing this?
Note: I'm not ONLY using regular expressions on these 'security question answers'
The 'answers' can be:
Selected from a db using a few basic sql regexes
Analyzed as many times as necessary using python regexes
Compared/pruned/scored as needed
This is a technical question, not a philosophical one ;-)
Thanks!
You're probably better off analyzing the n-gram distribution of the answers, similar to how language detection works.
This code is an example of language detection using trigrams. My guess is that keyboard-mashing trigrams are quite distinctive and rarely appear in normal language.
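To make the trigram idea concrete, here is a minimal sketch in Python. It is not the linked language-detection code. It assumes you can build a list of plausible answers (the KNOWN_ANSWERS list below is a tiny made-up stand-in, and the function names are hypothetical), counts character trigrams in that list, and scores each answer by its average smoothed log-probability. Lower scores suggest keyboard mashing.

```python
import math
from collections import Counter

# Hypothetical training data: in practice use a large list of real names,
# places, pet names, dictionary words, etc.
KNOWN_ANSWERS = ["smith", "johnson", "williams", "brown", "jones", "miller",
                 "davis", "garcia", "rover", "buddy", "london", "chicago"]

def trigrams(text):
    """Return the character trigrams of a lowercased, space-padded string."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def train(samples):
    """Count how often each trigram occurs in the sample answers."""
    counts = Counter()
    for sample in samples:
        counts.update(trigrams(sample))
    return counts

def score(text, counts):
    """Average log-probability per trigram, with add-one smoothing.

    Trigrams never seen in training get the lowest probability, so
    strings made mostly of unseen trigrams score lowest.
    """
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 bucket for unseen trigrams
    grams = trigrams(text)
    log_probs = (math.log((counts[g] + 1) / (total + vocab)) for g in grams)
    return sum(log_probs) / len(grams)

model = train(KNOWN_ANSWERS)
for answer in ["smith", "lakdsjflkaj"]:
    print(answer, round(score(answer, model), 3))
```

With a realistic training corpus you would score every answer in the database, sort by score, and pick a cutoff by inspecting the lowest-scoring rows; the right threshold depends on your data and is not something this sketch can tell you.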