sqlpostgresqlindexingpattern-matchingunaccent

Index for ILIKE + unaccent()


I have a lot of queries using

...
WHERE company_id = 2
AND unaccent(name) ILIKE unaccent('value');

Is possible to create a index to optimize this type of query? I take a look at GIN and pg_trgm but what about the unaccent() part?


Solution

  • The tricky part is that unaccent() is only STABLE, but indexes demand IMMUTABLE expressions. You can work around this by creating an immutable variant.

    This code-path requires the privilege to create C functions. And the second part uses SQL-standard syntax that requires Postgres 14 or later:

    CREATE OR REPLACE FUNCTION public.immutable_unaccent(regdictionary, text)
      RETURNS text
      LANGUAGE c IMMUTABLE PARALLEL SAFE STRICT AS
    '$libdir/unaccent', 'unaccent_dict';
    
    CREATE OR REPLACE FUNCTION public.f_unaccent(text)
      RETURNS text
      LANGUAGE sql IMMUTABLE PARALLEL SAFE STRICT
    RETURN public.immutable_unaccent(regdictionary 'public.unaccent', $1);
    

    Then create an expression index with the IMMUTABLE function, and use it in queries:

    CREATE INDEX tbl_unaccent_name_idx ON tbl (public.f_unaccent(name));
    
    SELECT * FROM tbl
    WHERE  f_unaccent(name) = f_unaccent('value');
    

    Or a GIN index using the additional module pg_trgm:

    CREATE INDEX tbl_unaccent_name_trgm_idx ON tbl USING gin (public.f_unaccent(name) gin_trgm_ops);
    
    SELECT * FROM tbl
    WHERE  f_unaccent(name) ~* f_unaccent('value');
    

    There is a lot of fine print to this. Detailed instructions (incl. variants for limited privileges and older versions):

    About pattern matching operators LIKE, ILIKE, ~, ~* & friends:

    And you'll want to keep an eye on special characters in the search pattern. See: