regex oracle

Extracting text with a regex under two conditions


My code doesn't work:

 regexp_substr('Lorem ipsum dolor sit amet. consectetur', '([^(.|()]+)|((.){0,9})')

The result should be the text up to the first dot; if the text does not contain a dot, the result should be at most the first 10 characters. Is it even possible to do this?

Two example texts:

  1. Lorem ipsum dolor sit amet. consectetur
  2. Donec quis turpis sed sapien ullamcorper viverra sodales a est

This is what the result should look like:

  1. Lorem ipsum dolor sit amet
  2. Donec quis

Solution

  • You can use a replacement-based approach here:

    REGEXP_REPLACE('Lorem ipsum dolor sit amet. consectetur',
                          '^([^.]+)\..*|^(.{10}).*',
                          '\1\2') 
    

    See this regex demo.

    Details:

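    The pattern consists of two alternatives:

      • ^([^.]+)\..* : from the start of the string, capture one or more characters other than a dot into group 1, then match a literal dot and the rest of the string
      • ^(.{10}).* : from the start of the string, capture the first 10 characters into group 2, then match the rest of the string

    The replacement is two backreferences to the captured values. Only one alternative can match, and the group that did not take part in the match is empty, so \1\2 yields whichever group captured.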
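
To check the expression against both sample texts from the question at once, something like the following query could be used. It is only a sketch: the `samples` subquery and the `id`, `txt` and `result` names are made up for illustration, and the results in the comments are the desired outputs listed in the question.

```sql
-- Hypothetical inline test data: the two sample texts from the question
WITH samples AS (
  SELECT 1 AS id, 'Lorem ipsum dolor sit amet. consectetur' AS txt FROM dual
  UNION ALL
  SELECT 2, 'Donec quis turpis sed sapien ullamcorper viverra sodales a est' FROM dual
)
SELECT id,
       -- Group 1: text before the first dot; group 2: first 10 characters when there is no dot
       REGEXP_REPLACE(txt, '^([^.]+)\..*|^(.{10}).*', '\1\2') AS result
FROM   samples
ORDER  BY id;

-- Desired results, as listed in the question:
--   1  Lorem ipsum dolor sit amet
--   2  Donec quis
```

Note that a text with no dot and fewer than 10 characters matches neither alternative, so REGEXP_REPLACE returns it unchanged, which is still consistent with the "maximum of 10 characters" requirement.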