regexposixoracle-sqldeveloperregexp-substrposix-ere

POSIX ERE Regular expression to find repeated substring


I have a set of strings containing a minimum of 1 and a maximum of 3 values in a format like this:

123;456;789
123;123;456
123;123;123
123;456;456
123;456;123

I'm trying to write a regular expression so I can find values repeated on the same string, so if you have 123;456;789 it would return null but if you had 123;456;456 it would return 456 and for 123;456;123 return 123

I managed to write this expression:

(.*?);?([0-9]+);?(.*?)\2

It works in the sense that it returns null when there are no duplicate values but it doesn't return exactly the value I need, eg: for the string 123;456;456 it returns 123;456;456and for the string 123;123;123 it returns 123;123

What I need is to return only the value for the ([0-9]+) portion of the expression, from what I've read this would normally be done using non-capturing groups. But either I'm doing it wrong or Oracle SQL doesn't support this as if I try using the ?: syntax the result is not what I expect.

Any suggestions on how you would go about this on oracle sql? The purpose of this expression is to use it on a query.

SELECT REGEXP_SUBSTR(column, "expression") FROM DUAL;

EDIT:

Actually according to https://docs.oracle.com/cd/B12037_01/appdev.101/b10795/adfns_re.htm

Oracle Database implements regular expression support compliant with the POSIX Extended Regular Expression (ERE) specification.

Which according to https://www.regular-expressions.info/refcapture.html

Non-capturing group is not supported by POSIX ERE


Solution

  • This answer describes how to select a matching group from a regex. So using that,

    SELECT regexp_substr(column, '(\d{3}).*\1', 1, 1, NULL, 1) from dual;
    #                                                       ^ Select group 1
    

    Working demo of the regex (courtesy: OP).