regex, bash, awk, gawk

awk: function to escape regex operators from a string


I need a function that escapes the regex operators in a string, for use in an awk script.

I came across this 'ugly' solution:

function escape_string( str )
{
    gsub( /\\/, "\\\\",  str );
    gsub( /\./, "\\.", str );
    gsub( /\^/, "\\^", str );
    gsub( /\$/, "\\$", str );
    gsub( /\*/, "\\*", str );
    gsub( /\+/, "\\+", str );
    gsub( /\?/, "\\?", str );
    gsub( /\(/, "\\(", str );
    gsub( /\)/, "\\)", str );
    gsub( /\[/, "\\[", str );
    gsub( /\]/, "\\]", str );
    gsub( /\{/, "\\{", str );
    gsub( /\}/, "\\}", str );
    gsub( /\|/, "\\|", str );

    return str;
}

Any better ideas?


Solution

  • You can use a single gsub with a bracket expression (character class), like this:

    function escape_string( str ) {
       gsub(/[\\.^$(){}\[\]|*+?]/, "\\\\&", str)
       return str
    }
    

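    In the replacement, & stands for the matched text. The string literal "\\\\&" reaches gsub as \\&, which gsub reads as a literal backslash followed by the match, so every operator character ends up prefixed with a backslash.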
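
To see the function at work, here is a minimal sketch that uses it to match a line literally. The wrapper name escape_demo and the environment variable needle are made up for this illustration. The search text is passed through ENVIRON rather than -v, on the assumption that you do not want awk to process backslash escapes in it.

```shell
#!/bin/sh
# Hypothetical helper: print the input lines that contain the literal text $1.
escape_demo() {
    # Pass the text through the environment; -v would interpret backslash escapes.
    needle="$1" awk '
    function escape_string(str) {
        # Prefix every regex operator with a backslash.
        gsub(/[\\.^$(){}\[\]|*+?]/, "\\\\&", str)
        return str
    }
    BEGIN { pat = escape_string(ENVIRON["needle"]) }
    $0 ~ pat { print }   # pat is used as a dynamic regex
    '
}

# Unescaped, the pattern a.b*c would also match "aXbbc".
printf '%s\n' 'a.b*c' 'aXbbc' | escape_demo 'a.b*c'
```

With gawk this prints only the line a.b*c, because the escaped pattern a\.b\*c no longer treats . and * as operators.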