I would like to combine two command lines as one single to avoid the intermediate files.
workdir: "/path/to/workdir/"
rule all:
input:
"my.filtered.vcf.gz"
rule bedtools:
input:
invcf="/path/to/my.vcf.gz",
bedgz="/path/to/my.bed.gz"
output:
outvcf="my.filtered.vcf.gz"
shell:
"/Tools/bedtools2/bin/bedtools intersect -a {input.invcf} -b {input.bedgz} -header -wa |"
"/Tools/bcftools/bcftools annotate -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description="Gene name">') > {output.outvcf}"
I am getting invalid syntax error. I would appreciate if you could explain how to combine multiple shell lines in snakemake.
You probably get an invalid syntax because of the "
you use in your shell here: Description="Gene name">
. This closes your shell. You can either escape these quotes or use the """
syntax:
rule bedtools:
input:
invcf="/path/to/my.vcf.gz",
bedgz="/path/to/my.bed.gz"
output:
outvcf="my.filtered.vcf.gz"
shell:
"/Tools/bedtools2/bin/bedtools intersect -a {input.invcf} -b {input.bedgz} -header -wa |"
"/Tools/bcftools/bcftools annotate -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description=\"Gene name\">') > {output.outvcf}"
or
rule bedtools:
input:
invcf="/path/to/my.vcf.gz",
bedgz="/path/to/my.bed.gz"
output:
outvcf="my.filtered.vcf.gz"
shell:
"""
/Tools/bedtools2/bin/bedtools intersect -a {input.invcf} -b {input.bedgz} -header -wa | /Tools/bcftools/bcftools annotate -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description="Gene name">') > {output.outvcf}
"""
Note that you can use multi line with """
. Example without pipes:
shell:
"""
bedtools .... {input} > tempFile
bcftools .... tempFile > tempFile2
whatever .... tempFile2 > {output}
"""