spell-checking phonetics metaphone

What is the Metaphone 3 Algorithm?


I want to code the Metaphone 3 algorithm myself. Is there a description? I know the source code is available for sale but that is not what I am looking for.


Solution

  • The link posted by @Bo now points to the entire source code of a (now defunct) project.

    Here is a new link that goes directly to the source code for Metaphone 3: https://searchcode.com/codesearch/view/2366000/

    by Lawrence Philips

    Metaphone 3 is designed to return an approximate phonetic key (and an alternate approximate phonetic key when appropriate) that should be the same for English words, and most names familiar in the United States, that are pronounced similarly. The key value is not intended to be an exact phonetic, or even phonemic, representation of the word. This is because a certain degree of 'fuzziness' has proven to be useful in compensating for variations in pronunciation, as well as misheard pronunciations. For example, although Americans are not usually aware of it, the letter 's' is normally pronounced 'z' at the end of words such as "sounds".

    The 'approximate' aspect of the encoding is implemented according to the following rules:

    (1) All vowels are encoded to the same value - 'A'. If the parameter encodeVowels is set to false, only initial vowels will be encoded at all. If encodeVowels is set to true, 'A' will be encoded at all places in the word that any vowels are normally pronounced. 'W' as well as 'Y' are treated as vowels. Although there are differences in the pronunciation of 'W' and 'Y' in different circumstances that lead to their being classified as vowels under some circumstances and as consonants in others, for the purposes of the 'fuzziness' component of the Soundex and Metaphone family of algorithms they will always be treated here as vowels.
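
To make rule (1) concrete, here is a minimal Python sketch of the vowel handling only. It is not the Metaphone 3 implementation: the function name `vowel_skeleton` and the `encode_vowels` argument are hypothetical stand-ins for the encodeVowels parameter described above, consonants are passed through untouched, and collapsing a run of adjacent vowels into a single 'A' is an assumption of this sketch.

```python
# Rule (1): every vowel maps to 'A'; 'W' and 'Y' count as vowels.
VOWELS = set("AEIOUWY")


def vowel_skeleton(word: str, encode_vowels: bool = False) -> str:
    """Toy illustration of rule (1); consonants are copied unchanged."""
    out = []
    prev_was_vowel = False
    for i, ch in enumerate(word.upper()):
        if ch in VOWELS:
            # Assumption of this sketch: a run of adjacent vowels
            # produces at most one 'A'.
            # With encode_vowels=False only a word-initial vowel is kept.
            if not prev_was_vowel and (encode_vowels or i == 0):
                out.append("A")
            prev_was_vowel = True
        elif ch.isalpha():
            out.append(ch)
            prev_was_vowel = False
    return "".join(out)


print(vowel_skeleton("around"))                      # ARND
print(vowel_skeleton("around", encode_vowels=True))  # ARAND
```

The real algorithm decides where a vowel is "normally pronounced" with many context-sensitive rules, so treat this only as a picture of what the flag changes.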

    (2) Voiced and un-voiced consonant pairs are mapped to the same encoded value. This means that:
    * 'D' and 'T' -> 'T'
    * 'B' and 'P' -> 'P'
    * 'G' and 'K' -> 'K'
    * 'Z' and 'S' -> 'S'
    * 'V' and 'F' -> 'F'

    In addition to the above voiced/unvoiced rules, 'CH' and 'SH' -> 'X', where 'X' represents the "-SH-" and "-CH-" sounds in Metaphone 3 encoding.
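
The mapping in rule (2) can be sketched as a plain lookup table. This is only an illustration of the table itself, written in Python; the names `FOLD` and `fold_consonants` are hypothetical. The actual Metaphone 3 code chooses each letter's sound from its context (for instance, it does not treat every 'CH' as the "-CH-" sound), which this sketch deliberately ignores.

```python
# Rule (2): voiced consonants fold onto their unvoiced partners.
FOLD = {"D": "T", "B": "P", "G": "K", "Z": "S", "V": "F"}


def fold_consonants(word: str) -> str:
    """Toy illustration of rule (2); vowels and other letters pass through."""
    s = word.upper()
    out = []
    i = 0
    while i < len(s):
        if s[i:i + 2] in ("CH", "SH"):
            # Both digraphs share the single code 'X'.
            out.append("X")
            i += 2
        else:
            out.append(FOLD.get(s[i], s[i]))
            i += 1
    return "".join(out)


print(fold_consonants("bad"), fold_consonants("pat"))    # PAT PAT
print(fold_consonants("chip"), fold_consonants("ship"))  # XIP XIP
```

Words that differ only in voicing, such as "bad" and "pat", end up with the same string, which is the kind of 'fuzziness' the description is after.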