sqlregexpostgresqlpattern-matching

Regular expression in PostgreSQL LIKE clause


I'm stuck with a simple regular expression. Not sure what I'm missing. A little rusty on regex skills.

The expression I'm trying to match is:

select * from table where value like '00[1-9]%'
-- (third character should not be 0)

So this should match '0090D0DF143A' (format: text) but it's NOT!


Solution

  • Bracket expressions only work for the regular expression operator ~.
    And "third character should not be 0" translates to:

    WHERE  value ~ '^00[^0]'
    

    ^ ... match at start of string (your original expression could match at any position).
    [^0] ... a bracket expression (character class) matching any character that is not 0.

    Or better, yet:

    WHERE  value LIKE '00%'       -- starting with '00'
    AND    value NOT LIKE '000%'  -- third character is not '0'
    

    LIKE is not as versatile, but typically faster than regular expressions. It's probably faster to narrow down the set of candidates with a cheap LIKE expression.

    Generally, you would use NOT LIKE '__0', but since we already establish LIKE '00%' in the other predicate, we can use the narrower (cheaper) pattern NOT LIKE '000'.

    With big tables, index support is the key to performance. Postgres can use a plain B-tree index for left-anchored expressions (like `value LIKE '00%'). In modern Postgres for simple regular expressions, too. See:

    Postgres 11 added the "starts with" operator ^@, which is the king of micro-optimization now:

    WHERE  value ^@ '00'     -- starting with '00'
    AND    value !~~ '000%'  -- third character is not '0'
    

    !~~ being the Postgres operator to implement the SQL construct NOT LIKE. So that's just optional flexing.
    But ^@ can make an actual difference. For starters, you can use the bare string without concatenating wildcards as pattern. See: