bash parsing

Parsing optional and required arguments


I am new to bash, and after reading and trying a lot about how to parse arguments I still cannot do what I really want: parse both optional and required arguments. More specifically, I want to parse three arguments: the first is a fasta file, the second is an optional second fasta file, and the third is a directory.

my_script.sh -f1 file1.fasta --f2 file2.fasta -d /home/folder1/folder2

or

my_script.sh -f1 file1.fasta -d /home/folder1/folder2

I have tried to do this in many ways, but I don't know how to let the program identify when there are two fasta files and a directory, and when there is only one fasta file and a directory.

I want to save these arguments in variables because they will be used later by third parties.

I have tried this:



for i in "$@"; do
 case $i in
   -f1=|-fasta1=)
     FASTA1="${i#=}"
     shift # past argument=value
     ;;
   -d) DIRECTORY=$2
  shift 2
     ;;
   -d=|-directory=) DIRECTORY="${i#=}"
   shift # past argument=value
     ;;
   --f2=|-fasta2=) FASTA2="${i#*=}"
    shift # past argument=value
     ;;
   *)
     ;;
 esac
done

But I just got this

scripts_my_first_NGS]$ ./run.sh -f1 fasta.fasta -d /home/folder1
FASTA1  =
DIRECTORY     =
FASTA2     =

Solution

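  • Basically, you need to add a separate case for the form of each option that is not used with an equal sign. Note also that a pattern such as -f1= matches only that exact string; to match -f1=file1.fasta it needs a trailing wildcard, as in -f1=*.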

    Also, your shift commands are useless inside a for loop, because the loop iterates over a list that was already expanded before the first iteration. So convert it to a while [[ $# -gt 0 ]]; do loop instead.
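
    A quick way to convince yourself of the shift behaviour is the sketch below. demo_for_shift is a made-up name for illustration, not something from your script; it counts how many words the for loop visits while calling shift on every pass:

    ```shell
    #!/usr/bin/env bash
    # Hypothetical demo: count the words a for loop visits even though
    # the body calls shift on every iteration.
    demo_for_shift() {
        local visited=0 arg
        for arg in "$@"; do
            visited=$((visited + 1))
            shift    # changes $1, $2, ... but not the list being iterated
        done
        echo "$visited"
    }

    demo_for_shift -f1 fasta.fasta -d /home/folder1
    ```

    It prints 4: all four words are still visited, so the shifts never made the loop skip an option's value.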

    I also made a few other modifications that I suggest you adopt.

    while [[ $# -gt 0 ]]; do
        case $1 in
        -f1|-fasta1)
            FASTA1=$2
            shift
            ;;
        -f1=*|-fasta1=*)
            FASTA1=${1#*=}
            ;;
        -d|-directory)
            DIRECTORY=$2
            shift
            ;;
        -d=*|-directory=*)
            DIRECTORY=${1#*=}
            ;;
        -f2|-fasta2)
            FASTA2=$2
            shift
            ;;
        -f2=*|-fasta2=*)
            FASTA2=${1#*=}
            ;;
        --)
            # Must come before -*), which would otherwise match "--" too
            # Do FILES+=("${@:2}") maybe
            break
            ;;
        -*)
            echo "Invalid option: $1" >&2
            exit 1
            ;;
        *)
            # TODO
            # Do FILES+=("$1") maybe
            ;;
        esac
    
        shift
    done
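
    As for telling the one-file call from the two-file call: once the loop has run, it should be enough to check whether FASTA2 ended up non-empty. Here is a minimal sketch, assuming only the space-separated forms of the options. parse_args and describe_run are hypothetical names, and the messages are just placeholders for whatever your script does next:

    ```shell
    #!/usr/bin/env bash
    # parse_args: hypothetical wrapper around a loop like the one above.
    parse_args() {
        FASTA1='' FASTA2='' DIRECTORY=''
        while [[ $# -gt 0 ]]; do
            case $1 in
            -f1|-fasta1)   FASTA1=$2; shift ;;
            -f2|-fasta2)   FASTA2=$2; shift ;;
            -d|-directory) DIRECTORY=$2; shift ;;
            *) echo "Invalid argument: $1" >&2; return 1 ;;
            esac
            shift
        done
    }

    # describe_run: hypothetical caller that enforces the required
    # arguments and branches on the optional one.
    describe_run() {
        parse_args "$@" || return 1
        if [[ -z $FASTA1 || -z $DIRECTORY ]]; then
            echo "Both -f1 and -d are required." >&2
            return 1
        fi
        if [[ -n $FASTA2 ]]; then
            echo "two fasta files: $FASTA1 $FASTA2 -> $DIRECTORY"
        else
            echo "one fasta file: $FASTA1 -> $DIRECTORY"
        fi
    }

    describe_run -f1 file1.fasta -f2 file2.fasta -d /home/folder1/folder2
    describe_run -f1 file1.fasta -d /home/folder1/folder2
    ```

    Resetting the three variables at the top of parse_args keeps a stale FASTA2 from an earlier call, or from the environment, from being mistaken for a second file.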
    

    The handling of the with-equal and without-equal forms of the options can also be unified by using a helper function:

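    # Stores the option's argument in the global variable "__".
    # Returns 0 if the argument came from $2 (caller must shift), 1 otherwise.
    function get_opt_arg {
        if [[ $1 == *=* ]]; then
            __=${1#*=}
            return 1 # Value was embedded in $1; no extra shift needed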
        elif [[ ${2+.} ]]; then
            __=$2
            return 0 # Tells that shift is needed
        else
            echo "No argument provided to option '$1'." >&2
            exit 1
        fi
    }
    
    while [[ $# -gt 0 ]]; do
        case $1 in
        -d|-directory|-d=*|-directory=*)
            get_opt_arg "$@" && shift
            DIRECTORY=$__
            ;;
        -f1|-fasta1|-f1=*|-fasta1=*)
            get_opt_arg "$@" && shift
            FASTA1=$__
            ;;
        -f2|-fasta2|-f2=*|-fasta2=*)
            get_opt_arg "$@" && shift
            FASTA2=$__
            ;;
        --)
            # Must come before -*), which would otherwise match "--" too
            # Do FILES+=("${@:2}") maybe
            break
            ;;
        -*)
            echo "Invalid option: $1" >&2
            exit 1
            ;;
        *)
            # TODO
            # Do FILES+=("$1") maybe
            ;;
        esac
    
        shift
    done
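
    The helper leans on two parameter expansions that are easy to misread, so here is a small sketch that isolates them. value_after_equals and has_second_arg are hypothetical names used only for illustration:

    ```shell
    #!/usr/bin/env bash
    # Print whatever follows the first "=" in the given word.
    # ${1#*=} removes the shortest prefix that ends in "=".
    value_after_equals() {
        printf '%s\n' "${1#*=}"
    }

    # Succeed only if a second positional parameter exists, even an empty one.
    # ${2+.} expands to "." when $2 is set and to nothing when it is unset.
    has_second_arg() {
        [[ ${2+.} ]]
    }

    value_after_equals -d=/home/folder1/folder2
    value_after_equals -f1=a=b.fasta
    has_second_arg -d '' && echo "empty value counts as provided"
    has_second_arg -d    || echo "no value provided"
    ```

    Only the text up to the first = is dropped, so a value that itself contains = survives intact. By contrast, the ${i#=} used in the question strips only a leading =, which is why it left the option name in place.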
    

    Update

    I found a complete solution to command-line parsing without relying on getopt[s] and it does it even more consistently: https://konsolebox.github.io/blog/2022/05/14/general-command-line-parsing-solution-without-using-getopt-s.html