
Single quotes within single quotes issue - Bash command line GNU parallel


I am new to the Bash command line and need to know the correct syntax for putting single quotes inside an existing single-quoted string.

ls *file.fa | parallel -j4 'perl -pe 's/^>/>{}/' {} >newfile_{}'

I know GNU parallel is not particularly well known or widely used, but I don't think the syntax would differ for any other command that requires single quotes within single quotes. The command should change > to >file.fa (> followed by the file name) at the start of each matching line in file.fa, where {} stands for each file name piped in from ls *file.fa.

Any help is much appreciated.


Solution

  • Quoting in GNU Parallel is a black art. There is a whole section dedicated to quoting in the manual.

    Conclusion: To avoid dealing with the quoting problems it may be easier just to write a small script or a function (remember to export -f the function) and have GNU parallel call that.

    In this case I would write a function:

    fasta_namer() {
      NAME=$1
      perl -pe "s/^>/>$NAME/" "$NAME" >newfile_"$NAME"
    }
    export -f fasta_namer
    ls *file.fa | parallel -j4 fasta_namer {}
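
    The export -f step matters because GNU parallel runs each job in a new shell, and a child bash only sees functions that have been exported. A minimal sketch of that behaviour, using a plain bash -c child in place of parallel; greet is a hypothetical function used only for illustration:

    ```shell
    #!/usr/bin/env bash
    # greet is a hypothetical function, used only to illustrate export -f.
    greet() {
      printf 'hello %s\n' "$1"
    }

    # Not exported yet: a child bash cannot find the function.
    before=$(bash -c 'greet world' 2>/dev/null || echo "not found")

    # export -f passes the function definition to child processes via the environment.
    export -f greet
    after=$(bash -c 'greet world')

    echo "before export: $before"
    echo "after export:  $after"
    ```

    This assumes the shell that parallel starts is bash; other shells do not import functions exported this way.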
    

    FASTA file names are usually not weird, but if they are (e.g. containing ' " \ * & / or other crazy chars) then this might solve it:

    fasta_namer() {
      NAME=$1
      PERLQUOTED=$2
      NEWNAME=$3
      perl -pe "s/^>/>$PERLQUOTED/" "$NAME" >"$NEWNAME"
    }
    export -f fasta_namer
    ls *file.fa | parallel -j4 fasta_namer {} '{=$_=quotemeta($_)=}' {.}.new.fa
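
    As for the literal question: bash has no escape character inside single quotes, so a single-quoted string cannot contain a '. The common workaround is '\'': close the quoted string, add a backslash-escaped quote, and reopen the string. A minimal sketch, with a plain bash -c child standing in for parallel and name as a placeholder for the file name:

    ```shell
    #!/usr/bin/env bash
    # '\'' = close the single-quoted string, add an escaped ', reopen the string.
    cmd='printf "%s\n" '\''s/^>/>name/'\'
    # The variable now holds a command line that contains single quotes of its own.
    printf '%s\n' "$cmd"

    # A child shell parses those inner quotes when it runs the command line.
    result=$(bash -c "$cmd")
    printf '%s\n' "$result"
    ```

    Applied to the command in the question, the same idiom would give the line below. It is not tested here, and it is only reasonable for plain file names, which is why the function approach above is the safer choice.

    ls *file.fa | parallel -j4 'perl -pe '\''s/^>/>{}/'\'' {} >newfile_{}'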