
How to compare almost similar Strings in Java? (String distance measure)


I would like to compare two strings and get a score for how similar they are. For example, "The sentence is almost similar" and "The sentence is similar".

I'm not familiar with the existing methods in Java, but in PHP I know the levenshtein function.

Are there better methods in Java?


Solution

  • The Levenshtein distance is a measure of how similar two strings are. More precisely, it is the minimum number of single-character edits (insertions, deletions or substitutions) needed to turn one string into the other.

    The algorithm is available in pseudo-code on Wikipedia. Converting that to Java shouldn't be much of a problem, but it is not built into the Java standard library.

    Wikipedia also lists other algorithms that measure the similarity of strings.
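
As a rough illustration of the conversion mentioned above, here is a minimal sketch of the standard dynamic-programming formulation in Java. The class name Levenshtein and the similarity helper are my own assumptions, not part of any library. The helper divides the distance by the longer string's length to get a score between 0 and 1, which is one common way to normalize but not the only one. Note that it compares UTF-16 char values, so characters outside the Basic Multilingual Plane count as two units.

```java
final class Levenshtein {

    private Levenshtein() {
        // Utility class (hypothetical name), not meant to be instantiated.
    }

    /**
     * Minimum number of single-character insertions, deletions or
     * substitutions needed to turn a into b.
     */
    static int distance(String a, String b) {
        // Only two rows of the DP table are kept at any time.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Assumed normalization: 1.0 means identical, 0.0 means nothing in common
     * relative to the longer string.
     */
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) distance(a, b) / maxLen;
    }

    public static void main(String[] args) {
        String s1 = "The sentence is almost similar";
        String s2 = "The sentence is similar";
        System.out.println(distance(s1, s2));   // 7: the characters of "almost " are removed
        System.out.println(similarity(s1, s2));
    }
}
```

For the two example sentences the distance is 7, because the only difference is the seven characters of "almost ". Whether a normalized score like this is "better" depends on what you want to treat as similar; it is sensitive to character-level edits, not to word order or meaning.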