regexoracle-databaseposix-ere

RegExp for a comma-separated list of terms that finds terms not ending or starting with a character


I have a String containing a comma-separated list of terms, like this:

term1, term2* ,term3*,term 4 , *term5, *term 6 , term 7*,* term 8

Each term may have leading or trailing whitespaces. These should be ignored. Each term may have whitespaces inside it. I want to find all terms not starting or ending in an asterisk. In the list above, that would be "term1" and "term 4".

My failed attempts only led to me finding every term (just dropping the asterisks rather than ignoring the term) like in this example: https://regex101.com/r/9QjjJ5/1.

I've also tried achieving this with lookahead expressions and borders, but must be using them wrongly, as the found term is then just shortened or excluding spaces inside the term.


Solution

  • You can use the conventional method in Oracle to split the string and then use REGEXP_LIKE to filter.

    WITH tab ( terms ) AS (
        SELECT
            'term1, term2* ,term3*,term 4 , *term5, *term 6 , term 7*,* term 8'
        FROM
            dual
    ) SELECT * FROM
    (
    SELECT DISTINCT
        TRIM(regexp_substr(terms,'[^,]+',1,level) ) term 
      FROM
        tab
    CONNECT BY
        regexp_substr(terms,'[^,]+',1,level) IS NOT NULL
        ) WHERE NOT REGEXP_LIKE (term,'^\*|\*$');