bash loops genetics

Code not working when included in a loop


I have a directory full of paired input files (80 samples, so 160 files in total). An example of a paired input is:

G49Am24_1_100_a100_1.fq.gz
G49Am24_1_100_a100_2.fq.gz

All input pairs will have _1.fq.gz and _2.fq.gz at the end.

I'm using Trim Galore (trim_galore), a tool for cleaning genetic data. When I run it on a single pair of files from within the directory, it works perfectly:

trim_galore --length 40 --quality 25 --paired ./G49Am24_1_100_a100_1.fq.gz ./G49Am24_1_100_a100_2.fq.gz

I'd like to run a loop that will clean all of the pairs of files. This is my first go at writing a loop, and I came up with:

for infile in *_1.fq.gz ; do
   base=$(basename ${infile} _1.fq.gz) > trim_galore --length 40 --quality 25 --paired ${infile} ${base}_2.fq.gz
done

From the code above, I get the error message '--length: command not found' (multiple times).

Any ideas?


Solution

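  • Your syntax is incorrect: > is the redirection operator. As written, the shell treats base=$(...) as a variable assignment, > trim_galore as a redirection of output to a file named trim_galore (which is created empty), and --length as the command to run, which does not exist. Put the assignment and the trim_galore call on separate lines: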

    for infile in *_1.fq.gz; do
        base=$(basename "$infile" _1.fq.gz)
        second="${base}_2.fq.gz"
        trim_galore --length 40 --quality 25 --paired "$infile" "$second"
    done
    

    You could also use bash parameter expansion (pattern substitution) instead of basename:

    for infile in *_1.fq.gz; do
        trim_galore --length 40 --quality 25 --paired "${infile}" "${infile/1.fq.gz}2.fq.gz"
    done
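
A slightly more defensive variant of the loops above, as a sketch only: it removes the _1.fq.gz suffix with ${infile%_1.fq.gz}, which matches only at the end of the name, skips any sample whose _2 mate is missing, and prints each command instead of running it. The function name pair_commands and the use of echo as a dry run are illustrative assumptions, not part of trim_galore.

```shell
# Print the trim_galore command for every complete pair found in a directory.
# pair_commands is a hypothetical helper name used for illustration.
pair_commands() {
    local dir=$1 infile mate
    for infile in "$dir"/*_1.fq.gz; do
        # If the glob matched nothing it stays literal, so skip it.
        [ -e "$infile" ] || continue
        # Strip the trailing _1.fq.gz and append _2.fq.gz to get the mate.
        mate="${infile%_1.fq.gz}_2.fq.gz"
        if [ ! -e "$mate" ]; then
            echo "skipping $infile: missing $mate" >&2
            continue
        fi
        # Dry run: echo the command rather than executing it.
        echo trim_galore --length 40 --quality 25 --paired "$infile" "$mate"
    done
}
```

Calling pair_commands . lists what would be run in the current directory. Once the output looks right, dropping the echo in front of trim_galore makes the loop execute the commands.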