I want to use DoubleMetaphone to get a phonetic encoding of a given string. For example:
import org.apache.commons.codec.language.DoubleMetaphone;
String s1 = "computer";
String code = new DoubleMetaphone().doubleMetaphone(s1); // primary encoding
Result: computer -> KMPT
The issue arises when I try to encode longer strings.
import org.apache.commons.codec.language.DoubleMetaphone;
String s1 = "dustinhoffmanisanactor";
String code = new DoubleMetaphone().doubleMetaphone(s1); // primary encoding
Result: dustinhoffmanisanactor -> TSTN
It appears to stop after the first four encoded characters: only "dustin" is encoded (dustin -> TSTN) and the rest of the string is ignored.
I used the Python implementation of Double Metaphone and it works as expected.
>>> from metaphone import doublemetaphone
>>> doublemetaphone("dustinhoffmanisanactor")[0]
"TSTNFMNSNKTR"
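It turns out I needed to set the maximum code length. By default, DoubleMetaphone in Commons Codec limits the encoding to 4 characters.
import org.apache.commons.codec.language.DoubleMetaphone;
String s1 = "dustinhoffmanisanactor";
DoubleMetaphone dm = new DoubleMetaphone();
dm.setMaxCodeLen(100); // raise the limit from the default of 4
String code = dm.doubleMetaphone(s1);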
This gives the expected result, TSTNFMNSNKTR.
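To make the effect of the limit concrete, here is a minimal stdlib-only sketch. It assumes the cap behaves like plain truncation of the full code, which is consistent with the outputs shown above. The class MaxCodeLenSketch and the helper capCode are hypothetical names for illustration only and are not part of Commons Codec.

```java
// Hypothetical illustration: models the max code length as simple truncation.
// This is NOT the library's implementation, only a sketch of the observed behavior.
final class MaxCodeLenSketch {

    // Returns at most maxCodeLen characters of the full phonetic code.
    static String capCode(String fullCode, int maxCodeLen) {
        return fullCode.length() <= maxCodeLen
                ? fullCode
                : fullCode.substring(0, maxCodeLen);
    }

    public static void main(String[] args) {
        String full = "TSTNFMNSNKTR"; // full code reported by the Python version
        System.out.println(capCode(full, 4));   // cap of 4, as with the default
        System.out.println(capCode(full, 100)); // cap of 100, as with setMaxCodeLen(100)
    }
}
```

With a cap of 4 the sketch yields TSTN, and with a cap of 100 the whole code survives, mirroring the two results in the question and answer.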