regexregex-recursion

Regex for matching text followed by text until closed bracket except matching opening bracket


Sorry if question is confusion. From below text

value = ( select max( tbl.column_name ) from table tbl where trim(colum2)='value1'
and (trim((colum3)))='value3')

I want output below

select max( tbl.column_name ) from table tbl where trim(colum2)='value1'
    and (trim((colum3)))='value3'

basically starting from select max( till ) except ) is matching a opening bracket.

(select[ ]*max[ ]*\(.*column_name[ ]*\)[^)]*)

This matches only till trim(colum2 Need help to escape bracket ) is there is a opening bracket including any nested

Thanks

Edit: I finally did it java as shown below. But would like to know REGEX solution

String sql = readFile(file.getPath());
Pattern patTabs = Pattern.compile("(\\([ \n]*SELECT[ \n]*MAX[ \n]*\\(.*COLUMN_NAME[ \n]*\\))", Pattern.CASE_INSENSITIVE);
Matcher tabMat = patTabs.matcher(sql);
while (tabMat.find()) {
    int i = tabMat.start();
    int j = tabMat.end();
    int l = 0;
    for (j = tabMat.end(); j < sql.length(); j++) {
        char k = sql.charAt(j);
        if (k == '(') {
            l++;
        }
        if (k == ')') {
            if (l == 0) {
                break;
            } else {
                l--;
            }
        }
    }
    System.out.println(sql.substring(i, j + 1))
}

Solution

  • You'll have to replace the line terminators with a whitespace in those expressions if you want a elegant solution. So you'll need to change this-

    value = ( select max( tbl.column_name ) from table tbl where trim(colum2)='value1'
    and (trim((colum3)))='value3')
    

    to this-

    value = ( select max( tbl.column_name ) from table tbl where trim(colum2)='value1' and (trim((colum3)))='value3')
    

    Edited: Afterwards, you can use \( *?(select *?max\( .*?column_name *?\).+) *?\) to match the expression you expect to.

    Here's a live demo

    If you REALLY don't want to replace the line terminator in the expression and instead want to work with

    value = ( select max( tbl.column_name ) from table tbl where trim(colum2)='value1'
    and (trim((colum3)))='value3')
    
    

    you may use this \([ \n]*?(select[ \n]*?max\([ \n].*?column_name[ \n]\)[\D\d \n]+)[ \n]*?\). But I highly recommend against so, because if you feed in multiple value = ( select..... ) to the same pattern, it will not work.

    Here's a live demo