regex substring denodo vql

Get part of a string with regex / SQL substr


I'm trying to extract a part of my string:

| output | Fullstring | Expected Result |
|---|---|---|
| xi | xi-AN224432_-_this_is_some_text-x10.pdf | AN224432 or AN224432_ |
| F1800X7T2P5P | xi-F1800X7T2P5P-blahblah-v01_00-EN.pdf | F1800X7T2P5P |
| 1E303S | 1E303S-blahblah-v01_20-EN.pdf | 1E303S |

The regex I tried is:

select 
    regexp(Fullstring ,'^(xi-)?(\[A-Za-z0-9\]+)-.\*','$2') output  

This returns the second row correctly, but not the first one.

The output I need is the alphanumeric code, which sits either at the start of the string or right after a leading "xi-". The "xi" prefix must be omitted from the result, because it is not the alphanumeric code I am after: if the string starts with "xi", I want the second part of the string. I don't understand why the second row gives the correct result while the first one returns 'xi'. The third row is the simplest case, as there is no "xi" at the start.


Solution

  • I don't know which flavour of regex you are running; usually we would not escape the brackets, but that may be down to the syntax of your regex engine, so it is up to you to check whether you need the escapes (the same goes for \* instead of *). The only documentation I found is the page about the Denodo VQL regexp() function, which mentions Java or Perl style syntax.

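    What is missing from your pattern is the optional underscore character "_". In the first row the code is followed by "_-" rather than "-", so the only way your pattern can match is to skip the optional (xi-) group and take "xi" itself as the alphanumeric part before the first hyphen. That is why you get 'xi' back.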

    In Perl or PHP, it would be: ^(xi-)?(([A-Za-z0-9]+)_?)-.*

    Demo here: https://regex101.com/r/DCxixu/1

    I added a capturing group so that you can decide whether to use the full group with the underscore ($2) or drop the underscore ($3). You could even replace _? with _* if there could be several underscores.
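
To check the behaviour outside Denodo, here is a minimal sketch in Python that compares the pattern from the question (written without the backslash escapes) with the fixed one on the three sample strings. The helper name `extract` is made up for this illustration, and it only imitates `regexp(Fullstring, pattern, '$N')` by replacing the whole match with one capturing group. It assumes Denodo's Java-style engine backtracks the same way as Python's `re` module for this simple pattern.

```python
import re

# Pattern from the question (without the backslash escapes) and the fixed one.
ORIGINAL = r'^(xi-)?([A-Za-z0-9]+)-.*'
FIXED = r'^(xi-)?(([A-Za-z0-9]+)_?)-.*'

def extract(pattern, group, text):
    """Hypothetical helper: replace the whole match with capturing group `group`,
    imitating regexp(text, pattern, '$N'). Unmatched text is returned unchanged."""
    return re.sub(pattern, lambda m: m.group(group), text)

samples = [
    'xi-AN224432_-_this_is_some_text-x10.pdf',
    'xi-F1800X7T2P5P-blahblah-v01_00-EN.pdf',
    '1E303S-blahblah-v01_20-EN.pdf',
]

for s in samples:
    print(
        extract(ORIGINAL, 2, s),  # what the original query returns
        extract(FIXED, 2, s),     # code with the optional underscore
        extract(FIXED, 3, s),     # code without the underscore
    )
# xi AN224432_ AN224432
# F1800X7T2P5P F1800X7T2P5P F1800X7T2P5P
# 1E303S 1E303S 1E303S
```

In the VQL query the equivalent change would be to swap in the fixed pattern and use '$3' (or '$2' to keep the underscore) as the replacement.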