php mysql lucene full-text-search sphinx

Best way to deal with misspellings in a MySQL fulltext search


I have about 2000 rows in a MySQL database.

Each row is a max of 300 characters and contains a sentence or two.

I use MySQL's built-in fulltext search to search these rows.

I would like to add a feature so that typos and accidental misspellings are corrected, if possible.

For example, if someone types "right shlder" into the searchbox, this would equate to "right shoulder" when performing the search.

What is the simplest way to add this kind of functionality? Is it worth adding an external search engine of some kind, like Lucene? (That seems like overkill for such a small dataset.) Or is there a simpler way?


Solution

  • I think you should use SOUNDS LIKE or SOUNDEX().

    As your data set is so small, one solution may be to create a new table that stores the individual words (or their soundex values) contained in each text field, and to use SOUNDS LIKE on that table.

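    e.g.:

    -- `table` and `column` are placeholder names; both are reserved
    -- words in MySQL, so they must be quoted with backticks.
    SELECT * FROM `table` WHERE id IN
    (
        SELECT refid FROM tableofwords
        WHERE `column` SOUNDS LIKE 'right' OR `column` SOUNDS LIKE 'shlder'
    )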
    

    See: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html

    I believe it is not possible to do a wildcard search on the string :(
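To make the word-table idea concrete, here is a minimal sketch in Python (chosen only for illustration; PHP ships its own soundex() function). It splits each row into words, stores a Soundex code per word, and looks up query words by code, which is roughly what the tableofwords subquery does. The sample rows and the names build_word_index and search are hypothetical. It implements the classic 4-character Soundex, whereas MySQL's SOUNDEX() can return longer strings and differs in some details, so treat it as an approximation rather than a drop-in equivalent.

```python
import re
from collections import defaultdict

# Letter-to-digit groups of the classic Soundex scheme.
_CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """Return the classic 4-character Soundex code, or "" if there are no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def build_word_index(rows):
    """Map each Soundex code to the ids of rows containing a word with that code."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in re.findall(r"[A-Za-z]+", text):
            index[soundex(word)].add(row_id)
    return index


def search(index, query):
    """Return ids of rows that phonetically match ANY query word (OR semantics)."""
    ids = set()
    for word in re.findall(r"[A-Za-z]+", query):
        ids |= index.get(soundex(word), set())
    return ids


# Hypothetical sample data standing in for the ~2000 rows.
rows = {
    1: "Pain in the right shoulder",
    2: "Left knee swelling",
    3: "Fracture of the left wrist",
}
index = build_word_index(rows)
print(search(index, "right shlder"))  # {1}
```

Because Soundex only compares how words sound, it catches dropped vowels such as "shlder" but can miss typos that change the first letter or swap in consonants from a different group.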