Tags: bash, reference, associative-array, declare

How to deserialize ${aa[@]@K} into referenced associative array with no eval effect?


How can I reference an associative array and reassign its content from a serialized string?

(The no-eval requirement is now spelled out explicitly in the title, but it is a natural one when it comes to deserialization.)

Important: The serialized string comes as-is from another source (assume, e.g., that it is read from a file). The code below demonstrates how it was created, but the variable aa_original is NOT available in my actual code.

As it was once created with ${aa[@]@K} parameter expansion, it can contain $'...' constructs:

${parameter@K} Produces a possibly-quoted version of the value of parameter, except that it prints the values of indexed and associative arrays as a sequence of quoted key-value pairs (see Arrays).
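As a rough illustration of that output format, the sketch below serializes a small array in which one value contains a newline. The function name `show_serialized` and the array `demo` are made up for this example, and it assumes bash 5.1 or later, where `@K` is available. The order of the pairs in the output is not guaranteed.

```shell
#!/bin/bash
# Hypothetical helper: prints the @K form of a small associative array.
show_serialized() {
    local -A demo=([plain]="4" [multi]=$'line1\nline2')
    local s
    s=${demo[@]@K}            # plain assignment keeps the result as one string
    printf '%s\n' "$s"
}

show_serialized
```

The value with the embedded newline should come back in `$'...'` form, so reading the string back takes more than splitting on whitespace or double quotes.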

#!/bin/bash

func() {
        declare -n aa=$1
        # `aa=()` can be used to blank out the referenced array
        # before populating it, but skipping it here
        # to see when reference is lost

        # of the below, only `serialized` string is available in actual code
        declare -A aa_original=("a" "3" "b" "4") 
        serialized=$(printf '%s' "${aa_original[@]@K}")

        aa="( $serialized )"

        # below only for debugging purposes
        echo "~ at the end of func() now"
        echo "~ serialized: $serialized"
        echo "~ aa_original:"
        for key in "${!aa_original[@]}"; do printf "%s->%s " $key ${aa_original[$key]}; done
        echo
        echo "~ aa:"
        for key in "${!aa[@]}"; do printf "%s->%s " $key ${aa[$key]}; done
        echo
        echo "~ leaving func() now"
}

declare -A AA=("a" "1" "b" "2")

func AA

echo "${AA["a"]} ~ ${AA["b"]}"

The above was expected to display 3 ~ 4 at the end, but instead gives 1 ~ 2.

The reason is that the global AA gets mangled:

~ at the end of func() now
~ serialized: b "4" a "3" 
~ aa_original:
b->4 a->3 
~ aa:
0->( b->"4" a->"3" )-> b->2 a->1 
~ leaving func() now

NOTE: As I have since discovered thanks to @pjh, this behavior is to be expected.
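The mangling follows from how bash treats a quoted right-hand side: it is an ordinary string, and assigning a string to an array name without a subscript stores it under subscript 0. A minimal sketch, using a made-up array name `demo` and assuming bash 5.1 or later for the key-value form of compound assignment:

```shell
#!/bin/bash
declare -A demo=([a]=1)

# Quoted: the parentheses are plain characters, so this is a scalar
# assignment and the whole string lands in demo[0]; demo[a] is untouched.
demo="( b 2 )"
quoted_result=${demo[0]}

# Unquoted: the parser sees a compound assignment, which replaces the
# array's contents with the single pair b -> 2.
demo=( b 2 )
compound_result=${demo[b]}

printf 'quoted:   demo[0]=%s\n' "$quoted_result"
printf 'compound: demo[b]=%s\n' "$compound_result"
```

The unquoted form only helps when the pairs are written literally in the script. With `aa=( $serialized )` the string would merely be word-split, quote characters included, rather than parsed.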

So I suppose I am left with using declare -A explicitly:

declare -A aa="( $serialized )"

The assignment works, but it loses the reference attribute, clearly:

~ at the end of func() now
~ serialized: b "4" a "3" 
~ aa_original:
b->4 a->3 
~ aa:
b->4 a->3 
~ leaving func() now
1 ~ 2

Why does this happen?


Since I posted my question, I learned (thanks again, @pjh) that using eval on the assignment (which also works) has the same side-effect as using declare (see my own answer below).

So I am still left with the original question: how can I process the serialized output of ${aa[@]@K} without any eval effect?

I know I can use a different serialization method, but I am asking about this specific output as-is.


For the record, this works:

#!/bin/bash

func() {
        declare -n aa=$1

        serialized=$(declare -A aa_original=("a" "3" "b" "4"); printf '%s' "${aa_original[@]@K}")

        # create a local interim associative array
        declare -A aa_local="( $serialized )"

        # blank out and refill the referenced array
        aa=()
        for key in "${!aa_local[@]}"; do aa["$key"]="${aa_local[$key]}"; done
}

declare -A AA=("a" "1" "b" "2")

func AA

echo "${AA["a"]} ~ ${AA["b"]}"

As I have learned from @pjh, the declare -A assignment is just as "bad" as using eval, so, without the extra copying step, the above is equivalent to:

#!/bin/bash

func() {
        declare -n aa=$1

        serialized=$(declare -A aa_original=("a" "3" "b" "4"); printf '%s' "${aa_original[@]@K}")

        aa=()
        eval aa="( $serialized )"
}

declare -A AA=("a" "1" "b" "2")

func AA

echo "${AA["a"]} ~ ${AA["b"]}"

This is already as simple as it gets (no double assignment), but unfortunately it performs an eval. It does, however, handle the output of ${aa_original[@]@K} well, including $'...' constructs.

I had hoped this would be possible without eval, and I did not realize that declare -A already had the same effect. It does not appear to be possible.
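To make the "eval effect" concrete, here is a small sketch with a hypothetical payload whose value embeds a command substitution. The variable names are invented for the illustration, and bash 5.1 or later is assumed. Both forms end up running the embedded command, which is why neither is suitable for untrusted input:

```shell
#!/bin/bash
# Hypothetical untrusted input: the value embeds a command substitution.
# Single quotes keep it literal at this point.
payload='k "$(echo expanded)"'

# declare re-parses the quoted string as a compound assignment and
# performs the usual expansions on each word, so the command runs here.
declare -A via_declare="( $payload )"
printf 'declare: %s\n' "${via_declare[k]}"

# eval re-parses the whole command line, so the command runs here as well.
declare -A via_eval
eval "via_eval=( $payload )"
printf 'eval:    %s\n' "${via_eval[k]}"
```

Under those assumptions, both lines should show `expanded` rather than the literal `$(echo expanded)` text.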


Solution

  • UPDATED based on OP's latest comments ...

    Assumptions:

    NOTE: we'll make use of the fact that xargs treats the contents between a pair of double quotes as a single field
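A quick way to see the splitting described in the note is to run a sample string through `xargs -n1` on its own. The helper name `split_fields` is made up for this sketch, and it relies on the default command of `xargs` being `echo`:

```shell
#!/bin/bash
# Hypothetical helper: split a string into one field per line using xargs.
split_fields() {
    printf '%s\n' "$1" | xargs -n1
}

# Mixed input: two double-quoted fields followed by two bare words.
split_fields '"a to z" "22" c 5'
```

Each double-quoted group is emitted as one line with its quotes removed, and bare words get a line each. Note that xargs applies its own quoting rules rather than the shell's.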

    One approach:

    func() {
        declare -n aa="$1"                      # named ref for array to be repopulated
        local ss="$2"                           # serialized string
    
        aa=()                                   # wipe array
    
        while IFS= read -r key                  # read current line of input (the key)
        do
            IFS= read -r val                    # read next line of input (the value)
            aa[$key]="$val"                     # repopulate array
        done < <(printf '%s\n' "${ss}" | xargs -n1)  # let xargs split ss into one item per line
    }
    

    Utility function to display contents of an array:

    print_arr() {
        declare -n arr="$1"
        local arr_name="$1"
    
        while read -r key
        do
            echo "${arr_name}[$key] = ${arr[$key]}"
        done < <(printf "%s\n" "${!arr[@]}" | sort)
    }
    

    Taking for a test drive:

    declare -A AA
    
    for sstring in '"a" "3" "b" "4"'   '"a to z" "22" "1 thru 3" "33"'   'c 5 d "1 2 3"'
    do
        printf "\n############### serialized string: '%s'\n" "${sstring}"
        printf "\n####### before\n"
    
        AA=("a" "1" "b" "2")
        print_arr AA
    
        printf "####### after\n"
        func AA "${sstring}"
        print_arr AA
    done
    

    This generates:

    ############### serialized string: '"a" "3" "b" "4"'
    
    ####### before
    AA[a] = 1
    AA[b] = 2
    ####### after
    AA[a] = 3
    AA[b] = 4
    
    ############### serialized string: '"a to z" "22" "1 thru 3" "33"'
    
    ####### before
    AA[a] = 1
    AA[b] = 2
    ####### after
    AA[1 thru 3] = 33
    AA[a to z] = 22
    
    ############### serialized string: 'c 5 d "1 2 3"'
    
    ####### before
    AA[a] = 1
    AA[b] = 2
    ####### after
    AA[c] = 5
    AA[d] = 1 2 3