bash shell

Why does command substitution change the behaviour of a name referenced parameter of a function?


Background

I want to output a string while updating a variable using a name reference. I came up with something like this:

#!/bin/bash

function update_name_ref_and_echo () {
    local -n name_ref=$1
    name_ref="new value"
    echo "bar"
}

foo="old value"
echo "foo before: $foo"

update_name_ref_and_echo 'foo'
echo "foo after: $foo"

It works perfectly as expected:

foo before: old value
bar
foo after: new value

The problem

However, when I try to capture the string echoed from the function using command substitution, the name reference no longer works.

#!/bin/bash

function update_name_ref_and_echo () {
    local -n name_ref=$1
    name_ref="new value"
    echo "bar"
}

foo="old value"
echo "foo before: $foo"

capture=$(update_name_ref_and_echo 'foo')
echo "capture: $capture"
echo "foo after: $foo"

generates:

foo before: old value
capture: bar
foo after: old value

The question

Why does command substitution change the behaviour of a name reference parameter of a function?

Further information

My bash version seems to be relatively recent.

❯ bash --version
GNU bash, version 5.2.26(1)-release (x86_64-redhat-linux-gnu)

Update

Command substitution also changes the behaviour of directly modifying a global variable.

#!/bin/bash

GLOBAL_FOO="old value"
echo "GLOBAL_FOO before: $GLOBAL_FOO"

function update_global_and_echo () {
    GLOBAL_FOO="new value"
    echo "bar"
}

update_global_and_echo
echo "GLOBAL_FOO after: $GLOBAL_FOO"

generates:

GLOBAL_FOO before: old value
bar
GLOBAL_FOO after: new value

While

#!/bin/bash

GLOBAL_FOO="old value"
echo "GLOBAL_FOO before: $GLOBAL_FOO"

function update_global_and_echo () {
    GLOBAL_FOO="new value"
    echo "bar"
}

capture=$(update_global_and_echo 'foo')
echo "capture: $capture"
echo "GLOBAL_FOO after: $GLOBAL_FOO"

generates:

GLOBAL_FOO before: old value
capture: bar
GLOBAL_FOO after: old value

Solution

  • capture=$(update_name_ref_and_echo 'foo') executes update_name_ref_and_echo 'foo' in a subshell, so foo only has the new value within that subshell. The change is lost when the subshell exits.
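
A minimal sketch of that subshell behaviour, independent of the function in the question: it compares BASHPID inside and outside a command substitution and shows that an assignment made inside does not reach the parent. The variable names parent_pid and child_pid are only for illustration.

```shell
#!/bin/bash
# Sketch: an assignment inside $(...) happens in a separate process.

foo="old value"

# BASHPID is the PID of the current bash process, so it differs in a subshell.
parent_pid=$BASHPID
child_pid=$(foo="new value"; echo "$BASHPID")

if [[ "$parent_pid" != "$child_pid" ]]; then
    echo "command substitution ran in a different process"
fi

# The assignment above only changed the subshell's copy of foo.
echo "foo in parent: $foo"
```

The last line should still report the old value, which is the same effect seen in the question.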

    Assuming you don't want to (or can't) change the definition of update_name_ref_and_echo, you can avoid executing it in a subshell using either of these approaches:

    1. With a coprocess:
      coproc CAT { cat; } || exit 1
      trap 'kill -9 "$CAT_PID"; exit' EXIT
      
      update_name_ref_and_echo 'foo' >&"${CAT[1]}" &&
      IFS= read -r capture <&"${CAT[0]}"
      
    2. With a temp file:
      tmp=$(mktemp) || exit 1
      trap 'rm -f "$tmp"; exit' EXIT
      
      update_name_ref_and_echo 'foo' >"$tmp" &&
      IFS= read -r capture <"$tmp"
      

    or you can call update_name_ref_and_echo in a subshell, print the resulting value of foo from within that subshell, and then read all NUL-terminated output from the subshell into the current shell (assuming the function doesn't output any NUL chars):

    1. With readarray:

      readarray -d '' -t arr < <(update_name_ref_and_echo 'foo'; printf '\0%s\0' "$foo")
      capture="${arr[0]%$'\n'}"
      foo="${arr[1]}"
      

      The capture="${arr[0]%$'\n'}" is to ensure any newline that update_name_ref_and_echo prints at the end of its stdout gets removed when populating capture. You could alternatively do IFS= read -r capture <<< "${arr[0]}" if you prefer but that'd probably create a temp file and so be slower.

    2. With a block of reads:

      {
          IFS= read -r -d '' capture
          capture="${capture%%$'\n'}"
          IFS= read -r -d '' foo
      } < <(update_name_ref_and_echo 'foo'; printf '\0%s' "$foo")
      

      If you prefer not to write IFS= read -r -d '' ... multiple times, you can use a function like this:

      readem() {
          local arg
          for arg; do
              local -n var="$arg"
              IFS= read -r -d '' var
              var="${var%%$'\n'}"
          done
      }
      
      readem capture foo < <(update_name_ref_and_echo 'foo'; printf '\0%s\n' "$foo")
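
For anyone who wants to try this end to end, here is a self-contained sketch that combines the function from the question with the readem() helper above. Nothing new is introduced; it only puts the pieces in one script so the result can be checked.

```shell
#!/bin/bash
# The function from the question, unchanged in behaviour.
update_name_ref_and_echo() {
    local -n name_ref=$1
    name_ref="new value"
    echo "bar"
}

# Reads one NUL-delimited field per named variable and strips one trailing newline.
readem() {
    local arg
    for arg; do
        local -n var="$arg"
        IFS= read -r -d '' var
        var="${var%%$'\n'}"
    done
}

foo="old value"

# The function still runs in a subshell, but the subshell also prints its
# own value of foo so the current shell can read it back.
readem capture foo < <(update_name_ref_and_echo 'foo'; printf '\0%s\n' "$foo")

echo "capture: $capture"
echo "foo after: $foo"
```

With the function as written in the question, this should print "capture: bar" followed by "foo after: new value".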
      

      It usually isn't an issue, but FYI the read/readarray calls above would only strip one trailing newline from each variable's contents, while var=$(cmd) would strip all of them. If you might have multiple newlines at the end and want to strip all of them, if present, you could change that readem() function to this or similar (depending on your requirements for handling that):

      readem() {
          local arg
          local re=$'(.*[^\n])?\n*$'
          for arg; do
              local -n var="$arg"
              IFS= read -r -d '' var
              [[ "$var" =~ $re ]] &&
                  var="${BASH_REMATCH[1]}"
          done
      }
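
The difference in trailing-newline handling can be checked with a short sketch. The emit function is a hypothetical stand-in for any command whose output ends in several newlines. The variable names are also made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: prints "bar" followed by three newlines.
emit() { printf 'bar\n\n\n'; }

# Command substitution strips every trailing newline.
via_subst=$(emit)

# A NUL-delimited read keeps them all; the %$'\n' expansion then removes one.
IFS= read -r -d '' via_read < <(emit; printf '\0')
via_read_one="${via_read%$'\n'}"

# The regex from the second readem() removes all trailing newlines.
re=$'(.*[^\n])?\n*$'
[[ "$via_read" =~ $re ]] && via_read_all="${BASH_REMATCH[1]}"

# %q makes any remaining newlines visible.
printf 'subst:              %q\n' "$via_subst"
printf 'read, one stripped: %q\n' "$via_read_one"
printf 'read, all stripped: %q\n' "$via_read_all"
```

Only the middle value should still carry trailing newlines, which is the case the regex-based readem() is meant to handle.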