java regex

Using regular expressions to extract messages from source code


I'm trying to write a Java program that extracts messages from a PL/SQL package (pkg) file.

The general message format in a package is:

Type 01:

Error_Msg.General_Message(pkg_name_,'INVALIDVALUE: The value 1,2 and 3 that you have entered is invalid.');

In some cases, however, a message looks like this:

Type 02:

Error_Msg.General_Message(pkg_name_,'INVALIDVALUE: The value :p you have entered is invalid.', Some_Pkg.Some_Function(parameter1, parameter2) );

NOTE: :p is a bind variable

Sometimes messages are concatenated using '||' in PL/SQL.

Example:

Error_Msg.General_Message(pkg_name_, 'This is a multiline'||'
     message');

I need to extract only the message text. For example, in Type 01 the text I'm looking for is:

'INVALIDVALUE: The value 1,2 and 3 that you have entered is invalid.'

I tried this pattern:

\\s*(\\w+):\\s*[,-:\\w*\\s*\"\\.\\|\\'\\(\\)\\\\]+

But this returns the wrong result for the second message type.

Could somebody help me with this?

Thanks!


Solution

  • Maybe you could try something like this?

    \\s*(\\w+):\\s*(?:'\\s*\\|\\|\\s*'|[^'])+'
    

    regex101 demo

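    '\\s*\\|\\|\\s*' matches the closing quote, the || operator and the opening quote of a concatenation, so the regex can continue matching across concatenated (including multiline) literals. Everything else is consumed by [^'], which stops at the first quote that is not part of a concatenation. Your original character class includes ', ( and ), so it can run past the closing quote and into the extra arguments of the second message type.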
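
As a minimal sketch of how the pattern above could be used from Java: the class name MessageExtractor and the method extract are hypothetical names chosen for illustration, and the MULTILINE sample is an invented message (the multiline example in the question has no "KEY:" prefix, so this pattern would not match it as written). Note that the match ends with the closing quote but does not include the opening one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class MessageExtractor {

    // The pattern from the answer as a Java string literal.
    // Group 1 captures the message key (e.g. INVALIDVALUE).
    private static final Pattern MESSAGE =
            Pattern.compile("\\s*(\\w+):\\s*(?:'\\s*\\|\\|\\s*'|[^'])+'");

    // Returns every match in the source text, with surrounding
    // whitespace trimmed. Each match runs from the key up to and
    // including the closing quote of the message literal.
    static List<String> extract(String source) {
        List<String> messages = new ArrayList<>();
        Matcher m = MESSAGE.matcher(source);
        while (m.find()) {
            messages.add(m.group().trim());
        }
        return messages;
    }

    public static void main(String[] args) {
        // Type 02 sample from the question.
        String type02 = "Error_Msg.General_Message(pkg_name_,"
                + "'INVALIDVALUE: The value :p you have entered is invalid.', "
                + "Some_Pkg.Some_Function(parameter1, parameter2) );";

        // Invented concatenated sample with a key prefix (assumption).
        String concatenated = "Error_Msg.General_Message(pkg_name_, "
                + "'MULTILINE: This is a multiline'||'\n     message');";

        for (String message : extract(type02)) {
            System.out.println(message);
        }
        for (String message : extract(concatenated)) {
            System.out.println(message);
        }
    }
}
```

For the Type 02 sample, the match stops at the quote after "invalid." and leaves out the Some_Pkg.Some_Function(...) argument. If you need the plain text of a concatenated message, you would still have to remove the '||' joins from the match yourself.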