python string email autocorrect

How to correct badly written emails?


I am trying to correct misspelled email addresses in a list by comparing their domains against the most common ones, e.g. hotmal.com to hotmail.com. The problem is that a single domain can be misspelled in a huge number of ways. It would be extremely helpful if someone knew of an algorithm in Python that can work as an autocorrect for email domains, or could tell me whether this is too complex a problem for a few lines of code.


Solution

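  • Look into the Levenshtein distance, starting at https://en.wikipedia.org/wiki/Levenshtein_distance. It is commonly used for autocorrect: compute the edit distance between each domain and a list of known-good domains, and replace the domain when the closest match is within a small threshold.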
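
As a rough illustration of that idea, here is a minimal pure-Python sketch. The names `levenshtein`, `correct_email`, `KNOWN_DOMAINS`, and the `max_distance=2` threshold are all assumptions made for this example, not part of any library; adjust the domain list and threshold to your data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical list of trusted domains; extend it for your own data.
KNOWN_DOMAINS = ["gmail.com", "hotmail.com", "yahoo.com", "outlook.com"]


def correct_email(email, known_domains=KNOWN_DOMAINS, max_distance=2):
    """Replace the domain with the closest known one if it is near enough."""
    local, sep, domain = email.rpartition("@")
    if not sep:
        return email  # no "@" found, leave the string untouched
    domain = domain.lower()
    best = min(known_domains, key=lambda d: levenshtein(domain, d))
    if levenshtein(domain, best) <= max_distance:
        return f"{local}@{best}"
    return email  # too far from every known domain, keep as written


emails = ["bob@hotmal.com", "alice@gmial.com", "carol@example.org"]
print([correct_email(e) for e in emails])
```

Keep the threshold small: a legitimate domain that happens to sit within the threshold of a known one (for example mail.com versus gmail.com) would be rewritten too, so it may be worth reviewing the corrections before applying them. The standard library's `difflib.get_close_matches` is a possible alternative; it uses a similarity ratio rather than Levenshtein distance.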