I have ten directories, each containing around 10-12 BAM files. I need to merge them with the Picard package, and I am looking for a better way to do it.
basic command:
java -jar picard.jar MergeSamFiles \
I=input_1.bam \
I=input_2.bam \
O=merged_files.bam
directory 1:
java -jar picard.jar MergeSamFiles \
I=input_16.bam \
I=input_28.bam \
I=input_81.bam \
I=input_34.bam \
... \
... \
I=input_10.bam \
O=merged_files.bam
directory 2:
java -jar picard.jar MergeSamFiles \
I=input_44.bam \
I=input_65.bam \
I=input_181.bam \
I=input_384.bam \
... \
... \
I=input_150.bam \
O=merged_files.bam
How can I pass the inputs through a variable when the file names are not sequential? I would also like to loop over the ten directories, but they contain different numbers of BAM files.
Should I use Python or R for this, or stick with a shell script? Please advise.
Why not use samtools?
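# The trailing slash makes the glob match directories only
for folder in my_bam_folders/*/; do
    folder=${folder%/}   # drop the trailing slash again
    samtools merge "$folder.bam" "$folder"/*.bam
done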
In general, samtools merge can merge all the BAM files in a given directory like this:
samtools merge merged.bam *.bam
EDIT: If samtools isn't an option and you have to use Picard, what about something like this?
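for folder in my_bam_folders/*/; do
    folder=${folder%/}
    # Collect one I=<file> argument per BAM file; a bash array keeps paths with spaces intact
    inputs=()
    for f in "$folder"/*.bam; do inputs+=("I=$f"); done
    java -jar picard.jar MergeSamFiles "${inputs[@]}" O="$folder.bam"
done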
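Before running the real merge, it may help to print the command each directory would produce. Below is a minimal dry-run sketch in portable sh, assuming the same my_bam_folders layout as above. The helper name build_merge_cmd is made up for illustration and is not part of Picard. It handles any number of BAM files per directory and skips directories that contain none.

```shell
#!/bin/sh
# Hypothetical helper (not part of Picard): print the MergeSamFiles command
# for one directory without running it.
build_merge_cmd() {
    dir=${1%/}                      # strip a trailing slash, if any
    set --                          # reuse the positional parameters as the argument list
    for f in "$dir"/*.bam; do
        [ -e "$f" ] || continue     # the glob matched nothing
        set -- "$@" "I=$f"
    done
    [ "$#" -gt 0 ] || return 1      # no BAM files in this directory
    # For display only. To actually merge, replace this line with:
    #   java -jar picard.jar MergeSamFiles "$@" "O=$dir.bam"
    echo "java -jar picard.jar MergeSamFiles $* O=$dir.bam"
}

# Dry run: one command per directory under my_bam_folders (assumed layout)
for folder in my_bam_folders/*/; do
    [ -d "$folder" ] || continue
    build_merge_cmd "$folder" || echo "skipping $folder: no BAM files" >&2
done
```

Since the file list comes from the glob, the inputs do not need sequential names, and the loop does not need to know how many files each directory holds.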