bash heredoc pbs qsub

Array indexing problems with heredoc and PBS


I am writing a script that generates batch jobs for the PBS scheduling system, using a heredoc to create the job instances. The problem is that arrays inside the batch script cannot be indexed by variables: the expression always returns the first element of the array. In contrast, indexing the array by an integer literal works as expected.

#!/bin/bash

export MY_ARRAY=(a b c)
export MY_ARRAY_LENGTH=${#MY_ARRAY[@]}

qsub -V -lselect=1:ncpus=1:mem=8gb -lwalltime=00:01:00 -N test -J 1-${MY_ARRAY_LENGTH} <<EOF

echo "PBS array index: \$PBS_ARRAY_INDEX"

I=\$((\${PBS_ARRAY_INDEX} - 1))
echo "MY_ARRAY index: \$I" 

## this works as expected, returning the second element of the array
echo "element selected by integer: ${MY_ARRAY[1]}"

## this does not work as expected, returning the first element of
## MY_ARRAY regardless of $I
echo "element selected by variable: ${MY_ARRAY[$I]}"

EOF

The output for array job 3, i.e. the one that should select the third element, is:

PBS array index: 3
MY_ARRAY index: 2
element selected by integer: b
element selected by variable: a

In the example above, the expression ${MY_ARRAY[$I]} should give me the value "c". Why doesn't this work?

NB: I understand that the backslash \ is necessary to keep variables from being evaluated by the calling script, but it does not seem to be needed for all of the variables. Why is that?


Solution

  • Writing such scripts is hard, and double-escaping is confusing. Consider a different approach: send the context (the variables and the code to execute) to the remote side, then execute the code there. That way the code stays simple, because it is written as a normal function without any escaping. Remember to check your script with shellcheck.

    #!/bin/bash
    
    MY_ARRAY=(a b c)
    work() {
        # normal script
        echo "PBS array index: $PBS_ARRAY_INDEX"
        
        I=$((${PBS_ARRAY_INDEX} - 1))
        echo "MY_ARRAY index: $I" 
        
        ## a constant index returns the second element of the array
        echo "element selected by integer: ${MY_ARRAY[1]}"
        
        ## inside the function, $I is expanded by the job itself,
        ## so this returns the element that belongs to this array index
        echo "element selected by variable: ${MY_ARRAY[$I]}"
    
    }
    
    qsub -V -lselect=1:ncpus=1:mem=8gb -lwalltime=00:01:00 -N test -J "1-${#MY_ARRAY[@]}" <<EOF
    $(declare -p MY_ARRAY)  # serialize variables
    $(declare -f work)      # serialize function
    work                    # execute the function
    EOF
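To see what the job actually receives, the same heredoc can be sent to cat instead of qsub. The following is only a dry-run sketch: the variable name job_script is made up for illustration, the function body is shortened, and running the captured text with a local bash is merely a stand-in for the scheduler.

```shell
#!/bin/bash
# Dry run: capture the text that would be piped to qsub instead of submitting it.
MY_ARRAY=(a b c)
work() {
    I=$((PBS_ARRAY_INDEX - 1))
    echo "element selected by variable: ${MY_ARRAY[$I]}"
}

# job_script is a hypothetical name used only in this sketch.
job_script=$(cat <<EOF
$(declare -p MY_ARRAY)
$(declare -f work)
work
EOF
)

# Print the generated job script for inspection.
printf '%s\n' "$job_script"

# Emulate array job 3 locally; PBS would normally set PBS_ARRAY_INDEX itself.
result=$(PBS_ARRAY_INDEX=3 bash -c "$job_script")
echo "$result"
```

The printed script contains a declare statement for MY_ARRAY and the full definition of work, and nothing inside the function body has been expanded by the submitting shell.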
    
    ## this does not work as expected, returning the first element of
    ## MY_ARRAY regardless of $I
    

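    Yes. I= is assigned on the remote side, whereas ${MY_ARRAY[$I]} is not escaped and is therefore expanded on the client side while the heredoc is being built. I is not set on the client side, so $I expands to an empty string, the empty subscript evaluates to 0, and the expression becomes the first element. ${MY_ARRAY[1]} is expanded on the client side as well, which is why it only appears to work: the job receives the literal text "b". Also note that bash cannot export arrays, so even with qsub -V the job never sees MY_ARRAY unless it is serialized into the script.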
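The difference can be reproduced without a scheduler. In this sketch, bash -s stands in for the PBS job and PBS_ARRAY_INDEX is set by hand; both are assumptions made for local testing, and the variable names unquoted and escaped are only illustrative.

```shell
#!/bin/bash
# Local demonstration: `bash -s` plays the role of the PBS job.
MY_ARRAY=(a b c)
unset I   # I is not set in the submitting shell, as in the question

# Unescaped: the submitting shell expands ${MY_ARRAY[$I]} while building the heredoc.
unquoted=$(PBS_ARRAY_INDEX=3 bash -s <<EOF
I=\$((PBS_ARRAY_INDEX - 1))
echo "${MY_ARRAY[$I]}"
EOF
)

# Escaped, with the array serialized into the script: the job does the indexing.
escaped=$(PBS_ARRAY_INDEX=3 bash -s <<EOF
$(declare -p MY_ARRAY)
I=\$((PBS_ARRAY_INDEX - 1))
echo "\${MY_ARRAY[\$I]}"
EOF
)

echo "unescaped expansion: $unquoted"
echo "escaped expansion:   $escaped"
```

The first job receives a script in which the element has already been replaced by the submitting shell. The second receives the expression itself together with the array definition, so the index is evaluated where I is actually set.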