ruby-on-railspostgresqlrails-postgresqlunaccent

Postgres accent insensitive LIKE search in Rails 3.1 on Heroku


How can I modify a where/like condition on a search query in Rails:

find(:all, :conditions => ["lower(name) LIKE ?", "%#{search.downcase}%"])

so that the results are matched irrespective of accents? (eg métro = metro). Because I'm using utf8, I can't use "to_ascii". Production is running on Heroku.


Solution

  • Proper solution

    Since PostgreSQL 9.1 you can just:

    CREATE EXTENSION unaccent;
    

    Provides a function unaccent(), doing what you need (except for lower(), just use that additionally if needed). Read the manual about this extension.

    More about unaccent and indexes:

    Poor man's solution

    If you can't install unacccent, but are able to create a function. I compiled the list starting here and added to it over time. It is comprehensive, but hardly complete:

    CREATE OR REPLACE FUNCTION lower_unaccent(text)
      RETURNS text
      LANGUAGE sql IMMUTABLE STRICT AS
    $func$
    SELECT lower(translate($1
         , '¹²³áàâãäåāăąÀÁÂÃÄÅĀĂĄÆćčç©ĆČÇĐÐèéêёëēĕėęěÈÊËЁĒĔĖĘĚ€ğĞıìíîïìĩīĭÌÍÎÏЇÌĨĪĬłŁńňñŃŇÑòóôõöōŏőøÒÓÔÕÖŌŎŐØŒř®ŘšşșߊŞȘùúûüũūŭůÙÚÛÜŨŪŬŮýÿÝŸžżźŽŻŹ'
         , '123aaaaaaaaaaaaaaaaaaacccccccddeeeeeeeeeeeeeeeeeeeeggiiiiiiiiiiiiiiiiiillnnnnnnooooooooooooooooooorrrsssssssuuuuuuuuuuuuuuuuyyyyzzzzzz'
         ));
    $func$;
    

    Your query should work like that:

    find(:all, :conditions => ["lower_unaccent(name) LIKE ?", "%#{search.downcase}%"])
    

    For left-anchored searches, you can use an index on the function for very fast results:

    CREATE INDEX tbl_name_lower_unaccent_idx
      ON fest (lower_unaccent(name) text_pattern_ops);
    

    For queries like:

    SELECT * FROM tbl WHERE (lower_unaccent(name)) LIKE 'bob%';
    

    Or use COLLATE "C". See: