I'm working on a project that uses Snakemake to run svaba, a variant caller, on genome data. svaba run can take multiple sample files but requires a flag in front of each file.
For example: svaba -g.... -t s1.bam -t s2.bam -t s3.bam
How do I go about setting this up in Snakemake? Here is some mock-up code. There are probably some syntax errors, but the idea is there:
SAMPLES = ['1', '2', '3', '4']

rule svaba_run:
    input:
        ref="references/hg19.fa",
        bam=expand("sample{sample}.bam", sample=SAMPLES)
    output:
        indels="test.svaba.indel.vcf",
        sv="test.svaba.sv.vcf"
    shell:
        "svaba run -g {input.ref} -t {input.bam}"
Right now this would just try to run the command like so:
svaba run -g references/hg19.fa -t sample1.bam sample2.bam sample3.bam sample4.bam
How do I get this to run with the '-t' flag in front of each sample?
Since you can use regular Python code in a Snakefile, you can build the string you need in a parameter: join the list of input files with " -t " as the separator, and put the first -t in the shell command itself, because join only inserts the separator between items, not before the first one:

SAMPLES = ['1', '2', '3', '4']

rule svaba_run:
    input:
        ref="references/hg19.fa",
        bam=expand("sample{sample}.bam", sample=SAMPLES)
    params:
        # Evaluates to "sample1.bam -t sample2.bam -t ..." (no leading -t)
        sample_bams=" -t ".join([f"sample{sample}.bam" for sample in SAMPLES])
    output:
        indels="test.svaba.indel.vcf",
        sv="test.svaba.sv.vcf"
    shell:
        # The -t here covers the first file; join supplies the rest
        "svaba run -g {input.ref} -t {params.sample_bams}"
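If you'd rather not depend on where the separator lands, one alternative is to prefix every file with the flag and then join on spaces. Here is a minimal sketch in plain Python; the helper name flagged is made up for illustration and is not part of Snakemake or svaba:

```python
def flagged(flag, paths):
    """Return 'flag p1 flag p2 ...' so that every path gets its own flag."""
    return " ".join(f"{flag} {path}" for path in paths)


SAMPLES = ["1", "2", "3", "4"]
bams = [f"sample{sample}.bam" for sample in SAMPLES]

# Prints: -t sample1.bam -t sample2.bam -t sample3.bam -t sample4.bam
print(flagged("-t", bams))
```

In the Snakefile, a helper like this could be called from params with a function of the rule's inputs, for example sample_bams=lambda wildcards, input: flagged("-t", input.bam), so the filename pattern is not typed twice. The shell command would then be "svaba run -g {input.ref} {params.sample_bams}" with no extra -t, since the string already carries one per file.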