bioinformatics

Automation for paired end reads with Cutadapt


I am trying to automate adapter trimming of my paired-end reads with cutadapt, but I keep running into the same issue: the adapter is trimmed from the forward reads, but not from the reverse reads. Even after modifying the code according to the documentation, the problem remains. If I trim only the forward or only the reverse reads, it works, but not as a paired-end job.

This is my code:

cat ids.txt | parallel 'cutadapt -j 24 -a AGATCGGAA --interleaved {}_R1.fq.gz {}_R2.fq.gz | cutadapt -j 24 -a AGATCGGAA --interleaved -o {}clipped_R1.fq.gz -p {}clipped_R2.fq.gz -'

Does anyone have a tip on how to modify this code to make it work? What am I doing wrong?


Solution

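  • Check the cutadapt documentation carefully; it has a dedicated section on trimming paired-end reads. The option you are looking for is -A: -a only applies to the forward reads (R1), while -A specifies the 3' adapter for the reverse reads (R2).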

    You are also complicating things with the --interleaved parameter: if the reads are interleaved, why are you passing two separate files? I'm not sure what you are trying to achieve, but I bet you have an extra cutadapt invocation.

    I guess you are trying to do something like:

    cat ids.txt | parallel 'cutadapt -j 24 -a AGATCGGAA -A <proper_adaptor> -o {}clipped_R1.fq.gz -p {}clipped_R2.fq.gz {}_R1.fq.gz {}_R2.fq.gz'
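Before launching the whole batch, it can help to print the command for each sample and check that both -a and -A are present. The following is only a dry-run sketch under some assumptions: `build_cmd` is a hypothetical helper name, the file naming follows the question, and `R2_ADAPTER` defaults to the same `<proper_adaptor>` placeholder as above, so it has to be set to the real reverse-read adapter before any printed command is actually run.

```shell
#!/usr/bin/env bash
# Dry run: print one paired-end cutadapt command per sample ID.
# R1_ADAPTER is taken from the question. R2_ADAPTER is a placeholder
# (assumption) and must be replaced with the real reverse-read adapter.
R1_ADAPTER="AGATCGGAA"
R2_ADAPTER="${R2_ADAPTER:-<proper_adaptor>}"

# build_cmd is a hypothetical helper: it only prints the command,
# it does not run cutadapt.
build_cmd() {
    local id="$1"
    printf 'cutadapt -j 24 -a %s -A %s -o %sclipped_R1.fq.gz -p %sclipped_R2.fq.gz %s_R1.fq.gz %s_R2.fq.gz\n' \
        "$R1_ADAPTER" "$R2_ADAPTER" "$id" "$id" "$id" "$id"
}

# Print a command for every non-empty line of ids.txt, if the file exists.
if [ -f ids.txt ]; then
    while IFS= read -r id || [ -n "$id" ]; do
        if [ -n "$id" ]; then
            build_cmd "$id"
        fi
    done < ids.txt
fi
```

Once the printed commands look right, the same command string can go back inside the parallel '...' call shown above.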