
Convert an array stored in a string variable back to an array


This will create arr[0]=hy, arr[1]="hello man":

declare -a arr=(hy "hello man")

I have the string 'hy "hello man"' stored as a whole in a variable, such that:

echo "$var"
hy "hello man"

How can I split it back into an array? I tried IFS=, IFS=" ", readarray and so on, but the array always ends up as arr[0]="hy "hello man"" instead of arr[0]=hy, arr[1]="hello man".

Desired array:

arr[0]=hy arr[1]="hello man"

For testing purposes:

arg () { printf "%d args:" "$#"; [ "$#" -eq 0 ] || printf " <%s>" "$@"; echo; }

arg hy "hello man"
2 args: <hy> <hello man>

But

var='hy "hello man"'
arg "$var"
1 args: <hy "hello man">

Is there a quoting trick that can help with my problem?


Solution

  • Some explanation about dsv and an alternative using sed

    In addition to Jetchisel's correct answer, here are some details.

    And since some installations don't have the loadable builtins installed, I've posted an alternative using sed that is safer than eval and parses the submitted string correctly.

    1. Basic usage:

    As previously commented on other posts, eval is evil!! To show this, I will use:

    var='my "Hello world." $(</etc/passwd)'
    

    Then

    enable dsv
    dsv -d\  -a arry "$var"
    declare  -p arry
    
    declare -a arry=([0]="my" [1]="Hello world." [2]="\$(</etc/passwd)")
    

    Yes, for this use case, the -S and -g switches are not required.
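Since dsv is a loadable builtin that not every installation ships, a script may want to check for it before relying on it. A minimal sketch, assuming that `enable dsv` returns a non-zero status when the builtin cannot be loaded; the helper name have_dsv is made up for this illustration:

```bash
#!/usr/bin/env bash
# have_dsv (hypothetical helper): succeed only if the dsv builtin can be enabled.
have_dsv() {
    enable dsv 2>/dev/null
}

var='my "Hello world."'
if have_dsv; then
    # Same call as in the answer, with the space delimiter quoted instead of escaped.
    dsv -d ' ' -a arry "$var"
    declare -p arry
else
    echo 'dsv is not available here; fall back to another method' >&2
fi
```

When the check fails, the sed-based approach from section 4 is one possible fallback.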

    2. Usage

    After running help dsv, I read a lot of explanation; the most useful paragraph is:

    Parse STRING, a line of delimiter-separated values, into individual
    fields, and store them into the indexed array ARRAYNAME starting at
    index 0. The parsing understands and skips over double-quoted strings. 
    If ARRAYNAME is not supplied, "DSV" is the default array name.
    If the delimiter is a comma, the default, this parses comma-
    separated values as specified in RFC 4180.
    ....
    The -d option specifies the delimiter. The delimiter is the first
    character of the DELIMS argument. Specifying a DELIMS argument that
    contains more than one character is not supported and will produce
    unexpected results. The -S option enables shell-like quoting: double-
    quoted strings can contain backslashes preceding special characters,
    and the backslash will be removed; and single-quoted strings are
    processed as the shell would process them. The -g option enables a
    greedy split: sequences of the delimiter are skipped at the beginning
    and end of STRING, and consecutive instances of the delimiter in STRING
    do not generate empty fields. If the -p option is supplied, dsv leaves
    quote characters as part of the generated field; otherwise they are
    removed.
    

    Shortened:

     -a ARRAYNAME Array name to populate (default: DSV)
     -d DELIMS    Separator character (one character only!)
     -S           Shell-like quoting (single and double quotes handled as the shell would)
     -g           Greedy (ignore leading and trailing delimiters)
     -p           Preserve quotes      
    

    3. About eval is evil

    If you try the same string with one of the other solutions posted here:

    Suggested by pmf:

    unset arr; declare -a arr="($var)"
    declare -p arr
    

    Then the command will dump my whole password file into the array:

    declare -a arr=([0]="my" [1]="Hello world." [2]="root:x:0:0:root:/root:/bin/bash
    " [3]="daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin" [4]="bin:x:2:2:bin:/bin:
    /usr/sbin/nologin" [5]="sys:x:3:3:sys:/dev:/usr/sbin/nologin" [6]="sync:x:4:6553
    ... 59 more lines ...
    /usr/sbin/nologin" [122]="_lxd:x:107:117::/var/lib/lxd/:/bin/false")
    

    Or the other answer, suggested by Hamidreza Shafizadeh:

    unset arr; eval "arr=($var)"
    declare -p arr
    

    This will produce the same array!

    So you have to trust the source of the variable's content!!!
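The risk can be reproduced without touching any real file. The sketch below swaps the /etc/passwd payload for a harmless command substitution that only echoes a marker (the marker text INJECTED is an arbitrary choice for this illustration):

```bash
#!/usr/bin/env bash
# Harmless stand-in for a malicious payload: the substitution only echoes a marker.
var='hy "hello man" $(echo INJECTED)'

unset arr
eval "arr=($var)"    # the command substitution is executed while the string is evaluated
declare -p arr
# declare -a arr=([0]="hy" [1]="hello man" [2]="INJECTED")
```

The third element holds the output of the command, not the literal text, which is exactly what an attacker-controlled string would exploit.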

    4. Alternative using sed.

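    If you don't have bash-builtins installed and still want some security, you could use sed (GNU sed syntax), adapted from "Replace every comma not enclosed in a pair of double quotes". The outputs below were produced with var='hy "hello man" $(</etc/passwd)':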

    IFS=$'\t' read -a arr < <(
        sed -e ':a;s/^\(\("[^"]*"\|'\''[^'\'']*'\''\|[^" '\'']*\)*\) /\1\t/;ta' <<<"$var"
    )
    declare -p arr
    
    declare -a arr=([0]="hy" [1]="\"hello man\"" [2]="\$(</etc/passwd)")
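To see what the sed stage does on its own, it can be run without read. This sketch assumes GNU sed (for \| and \t) and pipes the result through tr so the tabs become visible as | characters; the variable name split is only for illustration:

```bash
#!/usr/bin/env bash
var='hy "hello man" $(</etc/passwd)'

# Replace every space outside quotes with a tab, then display tabs as '|'.
split=$(
    sed -e ':a;s/^\(\("[^"]*"\|'\''[^'\'']*'\''\|[^" '\'']*\)*\) /\1\t/;ta' <<<"$var" |
        tr '\t' '|'
)
printf '%s\n' "$split"
# hy|"hello man"|$(</etc/passwd)
```

Each pass of the :a ... ta loop converts the first remaining unquoted space, so the space inside "hello man" is never touched and the command substitution stays literal text.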
    

    4.1. Using sed in a function.

    string2arry() {
        IFS=$'\t' read -a "$2" < <(
            sed -e <<<"$1" \
                ':a;s/^\(\("[^"]*"\|'\''[^'\'']*'\''\|[^" '\'']*\)*\) /\1\t/;ta'
        )
    }
    string2arry "$var" arr
    declare -p arr
    
    declare -a arr=([0]="hy" [1]="\"hello man\"" [2]="\$(</etc/passwd)")
    

    4.1b. Same, cleaning quotes

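    string2arry() {
        local -i _i
        local lc rc           # keep the helper variables out of the caller's scope
        local -n _res="$2"    # nameref to the caller's array
        IFS=$'\t' read -a _res < <(
            sed -e <<<"$1" \
                ':a;s/^\(\("[^"]*"\|'\''[^'\'']*'\''\|[^" '\'']*\)*\) /\1\t/;ta'
        )
        for _i in "${!_res[@]}"; do
            lc=${_res[_i]::1} rc=${_res[_i]: -1}
            # strip the outer quotes only if the field starts and ends with the same quote
            [[ ${#_res[_i]} -gt 1 && $lc == "$rc" ]] && [[ -z ${lc/[\'\"]} ]] &&
                _res[_i]=${_res[_i]:1: -1}
        done
    }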
    string2arry "$var" arr
    declare -p arr
    
    declare -a arr=([0]="hy" [1]="hello man" [2]="\$(</etc/passwd)")
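Tying this back to the question, the quote-cleaning function can be checked with the arg helper from the question. A self-contained sketch, assuming GNU sed and a bash recent enough for namerefs (local -n) and negative substring lengths:

```bash
#!/usr/bin/env bash
# Test helper taken from the question.
arg () { printf "%d args:" "$#"; [ "$#" -eq 0 ] || printf " <%s>" "$@"; echo; }

string2arry() {
    local -i _i
    local lc rc
    local -n _res="$2"
    IFS=$'\t' read -a _res < <(
        sed -e <<<"$1" \
            ':a;s/^\(\("[^"]*"\|'\''[^'\'']*'\''\|[^" '\'']*\)*\) /\1\t/;ta'
    )
    for _i in "${!_res[@]}"; do
        lc=${_res[_i]::1} rc=${_res[_i]: -1}
        # remove a matching pair of outer quotes, if present
        [[ ${#_res[_i]} -gt 1 && $lc == "$rc" ]] && [[ -z ${lc/[\'\"]} ]] &&
            _res[_i]=${_res[_i]:1: -1}
    done
}

var='hy "hello man"'
string2arry "$var" arr
arg "${arr[@]}"
# 2 args: <hy> <hello man>
```

Note that this only handles plain single- or double-quoted fields; escaped quotes inside a field are outside what the sed pattern recognises.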