
Awk, gsub, ampersands and unexpected expansion


First, apologies if this is a duplicate question. I'm new to bash scripting and couldn't even work out which keywords to search for. That said, I've tried to simplify the problem description as much as I can:

I have a text file (test.txt) that contains only this line:

REPLACE

I ran the following command, which is supposed to replace the file's text (i.e. REPLACE) with the value of the code variable, if (A & B):

code="if (A & B)" ; awk -v var="${code}" '{ gsub(/REPLACE/, var); print }' test.txt

Expected output: the value of the code variable, printed as is:

if (A & B)

Actual output: the ampersand is somehow expanded into 'REPLACE', the text matched by gsub's regexp parameter:

if (A REPLACE B)

Perhaps I need to escape the ampersand, but unfortunately the code variable is populated outside my control, so I can't edit its value manually.

FYI, the awk version is "GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)"

Thanks!


Solution

  • & is a backreference metacharacter in many tools; in replacement text it means "the string that matched the regexp you searched for". If you want to work with literal strings, then use string functions instead of regexps and backreferences.

    e.g., to replace the first occurrence of old on each line:

    code="if (A & B)"
    awk -v old="REPLACE" -v new="$code" 's=index($0,old){$0=substr($0,1,s-1) new substr($0,s+length(old))} 1' test.txt
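
    The one-liner above replaces only the first match on each line, whereas gsub() replaces every match. A minimal sketch of a replace-all variant that still avoids regexps, wrapped in a hypothetical shell function called replace_literal (the function name and the sample input are illustrative assumptions, not part of the original command):

    ```shell
    # replace_literal OLD NEW: hypothetical helper. Reads lines on stdin and
    # replaces every literal occurrence of OLD with NEW (no regexps involved).
    replace_literal() {
        awk -v old="$1" -v new="$2" '
        {
            rest = $0   # part of the line not yet scanned
            out = ""    # part of the line already rewritten
            # length(old) guards against an empty search string.
            while (length(old) > 0 && (s = index(rest, old)) > 0) {
                out = out substr(rest, 1, s - 1) new
                rest = substr(rest, s + length(old))
            }
            print out rest
        }'
    }

    code="if (A & B)"
    printf 'REPLACE and REPLACE\n' | replace_literal "REPLACE" "$code"
    ```

    With that input, both occurrences should come out as if (A & B). Note that -v still interprets backslash escape sequences in the assigned value (for example, \n becomes a newline), so if the replacement may contain backslashes, passing it through the environment and reading it from ENVIRON is one way to keep it fully literal.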
    

    The alternative, trying to sanitize regexps and replacements, is complicated, error-prone and generally not for the faint of heart, see: Is it possible to escape regex metacharacters reliably with sed
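
For the narrower case in the question, where the regexp is fixed and only the replacement text is untrusted, the sanitizing route can be sketched as follows. This is a hedged illustration rather than a general solution: replace_escaped is a hypothetical function name, the /REPLACE/ pattern is hard-coded, and the backslash handling assumes the POSIX rules for sub()/gsub() replacement strings, which not every awk implementation follows.

```shell
# replace_escaped NEW: hypothetical helper. Reads lines on stdin and
# replaces each match of /REPLACE/ with NEW taken literally.
replace_escaped() {
    awk -v new="$1" '
    BEGIN {
        # Put a backslash in front of every backslash and ampersand so
        # that the later gsub() treats them as literal characters.
        gsub(/[\\&]/, "\\\\&", new)
    }
    {
        gsub(/REPLACE/, new)
        print
    }'
}

code="if (A & B)"
printf 'REPLACE\n' | replace_escaped "$code"
```

Here the ampersand is turned into \& before it reaches the second gsub(), so the line should come out as if (A & B). The search side is not sanitized at all, which is part of why the literal-string approach above is usually the safer choice.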