snakemake bcftools

Combine shell command lines in snakemake


I would like to combine two commands into a single shell command to avoid writing intermediate files.

workdir: "/path/to/workdir/"

rule all:
    input: 
        "my.filtered.vcf.gz"

rule bedtools:
    input:
        invcf="/path/to/my.vcf.gz",
        bedgz="/path/to/my.bed.gz"
    output:
        outvcf="my.filtered.vcf.gz"
    shell:
        "/Tools/bedtools2/bin/bedtools intersect -a {input.invcf} -b {input.bedgz} -header -wa |"
        "/Tools/bcftools/bcftools annotate -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description="Gene name">') > {output.outvcf}"

I am getting an invalid syntax error. I would appreciate it if you could explain how to combine multiple shell commands in Snakemake.


Solution

  • You probably get the invalid syntax error because of the double quotes inside your shell string, here: Description="Gene name">. The first of them ends the Python string literal early, so the rest of the line no longer parses. You can either escape these quotes or use the """ syntax:

    rule bedtools:
        input:
            invcf="/path/to/my.vcf.gz",
            bedgz="/path/to/my.bed.gz"
        output:
            outvcf="my.filtered.vcf.gz"
        shell:
            "/Tools/bedtools2/bin/bedtools intersect -a {input.invcf} -b {input.bedgz} -header -wa |"
            "/Tools/bcftools/bcftools annotate -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description=\"Gene name\">') > {output.outvcf}"
    

    or

    rule bedtools:
        input:
            invcf="/path/to/my.vcf.gz",
            bedgz="/path/to/my.bed.gz"
        output:
            outvcf="my.filtered.vcf.gz"
        shell:
            """
            /Tools/bedtools2/bin/bedtools intersect -a {input.invcf} -b {input.bedgz} -header -wa | /Tools/bcftools/bcftools annotate -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description="Gene name">') > {output.outvcf}
            """
    

    Note that with """ you can also write one command per line; the lines run in order. Example without pipes:

    shell:
        """
        bedtools .... {input} > tempFile 
        bcftools .... tempFile > tempFile2
        whatever .... tempFile2 > {output}
        """