oraclecognosregexp-replace

Regular expression to remove a portion of text from each entry in commas separated list


I have a string of comma separated values, that I want to trim down for display purpose.

The string is a comma separated list of values of varying lengths and number of list entries.

Each entry in the list is formatted as a five character pattern in the format "##-NX" followed by some text.

e.g., "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc..."

Is there an regular expression function I can use to remove the text after the 5 character prefix portion of each entry in the list, returning "01-NX, 02-NX, 09-NX, 12-NX,..."?

I am a novice with regular expressions and I haven't been able figure out how to code the pattern.


Solution

  • I think what you need is

    regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1')
    

    The inner REGEXP_REPLACE looks for a pattern like nn-NX (two numeric characters followed by "-NX") and any number of characters up to the next comma, then replaces it with the first and third term, dropping the "any number of characters" part.

    The outer REGEXP_REPLACE looks for a pattern like two numeric characters followed by any number of characters up to the last NX, and keeps that part of the string.

    Here is the Oracle code I used for testing:

    with a as (
      select '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' as myString
      from dual
    )
    select mystring
    , regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1') as output
    from a