bash awk sh freebsd

How can you use awk to replace a pattern with an environment variable?


I am trying to write a simple script that reads text from standard input and writes it out unchanged, except that it replaces occurrences of the following pattern:

{{env MYVAR}} 
{{env PATH}} 
{{env DISPLAY}} 

with the contents of the environment variables MYVAR, PATH, DISPLAY, etc.

I want to avoid passing any parameters to this script: it should automatically detect each {{env VARNAME}} pattern and replace it with the value of the environment variable $VARNAME.

The script takes its input via standard input and provides the output via standard output.

Sample input text via standard input:

This is a basic templating system that can replace environment variables in regular text files.
For example the DISPLAY in this system is {{env DISPLAY}} and the path is {{env PATH}}.

Expected output via standard output:

This is a basic templating system that can replace environment variables in regular text files.
For example the DISPLAY in this system is :0.0 and the path is /bin;/usr/bin;/usr/local/bin.

What I have tried:

So far I have only managed to do it for one variable, by passing its name on the command line.

#!/bin/sh

# Check if the first argument is set
if [ -z "$1" ]; then
    echo "No variable name provided." >&2
    exit 1
fi

VARIABLE_NAME="$1"

# Use awk to replace '{{env VARIABLE_NAME}}' with the value of the environment variable
awk -v var_name="$VARIABLE_NAME" '
function escape(s) {
    esc = "";
    for (i = 1; i <= length(s); i++) {
        c = substr(s, i, 1);
        if (c ~ /[.[\]$()*+?^{|\\{}]/) {
            esc = esc "\\" c;
        } else {
            esc = esc c;
        }
    }
    return esc;
}
BEGIN {
    search = "{{env " var_name "}}";
    search_esc = escape(search);
    replacement = ENVIRON[var_name];
}
{
    gsub(search_esc, replacement);
    print;
}'

The above works, but it requires running ./parsing_script MYVAR.

I want to avoid having to specify the environment variables as command line arguments.

Architecture/OS

I am using FreeBSD's awk and its POSIX shell /bin/sh

Notes

If awk is not the right tool, I am open to other solutions (please no Python or Perl).


Solution

  • The following should work with any POSIX awk. Note that it performs the substitution recursively. If environment variable A={{env B}} and environment variable B=bar, then {{env A}} will be replaced with bar.

    It uses the regular expression [{][{]env[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[}][}] because a valid shell variable name is "a word consisting only of alphanumeric characters and underscores, and beginning with an alphabetic character or an underscore". So it will not substitute {{env 98FOO}}.

    The gap between {{env and the variable name can be any non-empty mixture of spaces, tabs, and other whitespace characters matched by [[:space:]].

    #!/bin/sh
    
    awk '
    BEGIN { re = "[{][{]env[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[}][}]" }
    $0 ~ re {
      s = $0
      while(match(s, re)) {
        v = substr(s, RSTART + 6, RLENGTH - 8)
        sub(/^[[:space:]]+/, "", v)
        s = substr(s, 1, RSTART - 1) ENVIRON[v] substr(s, RSTART + RLENGTH)
      }
      print s
      next
    }
    1'
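
To see which placeholders the pattern described above accepts, it can be tested on its own. This is only a sketch: classify_placeholders is a hypothetical helper name that is not part of the answer, and the candidate strings are made-up test cases.

```shell
#!/bin/sh
# classify_placeholders: hypothetical helper (not from the answer).
# Reads one candidate per line and reports whether the answer's regex matches it.
classify_placeholders() {
  awk '
  BEGIN { re = "[{][{]env[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[}][}]" }
  {
    verdict = ($0 ~ re) ? "match" : "no match"
    print verdict ": " $0
  }'
}

printf '%s\n' '{{env PATH}}' '{{env   MY_VAR2}}' '{{env 98FOO}}' \
              '{{envPATH}}' '{{env PATH }}' | classify_placeholders
# match: {{env PATH}}
# match: {{env   MY_VAR2}}
# no match: {{env 98FOO}}
# no match: {{envPATH}}
# no match: {{env PATH }}
```

Note the last case: the pattern allows whitespace only before the variable name, so a space before the closing braces leaves the placeholder untouched.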
    

    As mentioned in the comments, the recursive substitution can lead to infinite loops (e.g. with A='{{env A}}'). A version that makes only one pass of substitutions could look like this:

    #!/bin/sh

    awk '
    BEGIN { re = "[{][{]env[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[}][}]" }
    $0 ~ re {
      s = $0
      while(match(s, re)) {
        v = substr(s, RSTART + 6, RLENGTH - 8)
        sub(/^[[:space:]]+/, "", v)
        # Print the text before the match and the value, then scan only the rest,
        # so substituted values are never rescanned.
        printf("%s%s", substr(s, 1, RSTART - 1), ENVIRON[v])
        s = substr(s, RSTART + RLENGTH)
      }
      print s
      next
    }
    1'
    

    But of course, with A='{{env B}}' and B=bar, {{env A}} will become {{env B}}, not bar.
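
If nested placeholders are wanted but an infinite loop is not, one possible middle ground is to repeat the single-pass substitution a limited number of times. The following is a minimal sketch under that assumption, not something taken from the answer: the helper name render_bounded, the max_passes variable, and the default limit of 5 are all hypothetical.

```shell
#!/bin/sh
# render_bounded: hypothetical helper (not from the answer).
# Expands {{env NAME}} placeholders, rescanning each line at most max_passes times.
# Usage: render_bounded [max_passes] < input
render_bounded() {
  awk -v max_passes="${1:-5}" '
  BEGIN { re = "[{][{]env[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[}][}]" }

  # One left-to-right pass: values inserted during this pass are not rescanned.
  function expand_once(s,    out, v) {
    out = ""
    while (match(s, re)) {
      v = substr(s, RSTART + 6, RLENGTH - 8)
      sub(/^[[:space:]]+/, "", v)
      out = out substr(s, 1, RSTART - 1) ENVIRON[v]
      s = substr(s, RSTART + RLENGTH)
    }
    return out s
  }

  {
    line = $0
    # Stop as soon as no placeholder is left or the pass limit is reached.
    for (n = 0; n < max_passes && line ~ re; n++)
      line = expand_once(line)
    print line
  }'
}
```

For example, with B=bar and C='{{env B}}' exported, printf '%s\n' '{{env C}}' | render_bounded 2 prints bar, while a limit of 1 prints {{env B}}. With A='{{env A}}' the loop simply stops at the limit and leaves the placeholder in the output.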