sql regex oracle escaping

How to tell Oracle SQL that the backslash in a regex is not an escape symbol


The query select regexp_replace('Wd+Wd4Wd', '[\W\d]+', '_') from dual outputs _+_4_, which suggests that it treats the backslash as an escape character and replaces the literal W and d instead of + and 4. It works if \W and \d are not used inside brackets: select regexp_replace('Wd+Wd4Wd', '(\W|\d)+', '_') from dual outputs Wd_Wd_Wd, as expected.

How can I match non-word characters and digits within brackets?


Solution

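  • This is a known limitation: Oracle regular expressions do not support Perl-like shorthand character classes (such as \W and \d) inside bracket expressions; see "Oracle regex expression evaluates to false", for example. Inside brackets they are not interpreted as classes, which is why [\W\d] ends up matching the literal characters W and d.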

    You can use POSIX character classes inside bracket expressions. For this concrete scenario, you can use

    select regexp_replace('Wd+Wd4Wd', '[[:digit:][:punct:][:space:]]+', '_') from dual
    

    To match digits, either [:digit:] or 0-9 works inside the bracket expression.

    As for non-word characters, I think you just want to match punctuation and whitespace, so the POSIX character classes [:punct:] and [:space:] should be enough.
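As a rough sketch, the two queries below put the suggestion side by side. The first is the pattern from the answer. The second is an assumption of mine, not part of the original answer: a negated bracket expression that should be closer to what \W and \d mean together, since any character that is neither a letter nor an underscore is either a non-word character or a digit. One difference to keep in mind is that [:punct:] includes the underscore, whereas \W does not.

```sql
-- Pattern from the answer: digits, punctuation, and whitespace,
-- written with POSIX classes inside one bracket expression.
select regexp_replace('Wd+Wd4Wd', '[[:digit:][:punct:][:space:]]+', '_') as result
from dual;
-- expected result for this input: Wd_Wd_Wd

-- Assumed alternative (not from the original answer):
-- match every character that is NOT a letter and NOT an underscore,
-- which covers non-word characters and digits together.
select regexp_replace('Wd+Wd4Wd', '[^[:alpha:]_]+', '_') as result
from dual;
-- expected result for this input: Wd_Wd_Wd
```

Both queries give the same result for the sample string. They only diverge on input containing underscores or characters outside the three POSIX classes listed in the first pattern, so which one fits depends on how strictly you need to mirror \W.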