python conda snakemake samtools pysam

Snakemake Conda environment does not seem to be activating though it says it is


I am running Snakemake with the --use-conda option. Snakemake successfully creates the environment, which should include pysam. I am able to manually activate this created environment, and within it, run my script split_strands.py, which imports the module pysam, with no problems. However, when running the Snakemake pipeline, I get the following error log:

Activating conda environment: /projects/ps-yeolab3/ekofman/sc_STAMP_pipeline/STAMP/workflow/.snakemake/conda/7c375b6b
/projects/ps-yeolab3/ekofman/sc_STAMP_pipeline/STAMP/workflow/scripts/split_strands.py:166: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if args.output_fwd_bam is not '-':
/projects/ps-yeolab3/ekofman/sc_STAMP_pipeline/STAMP/workflow/scripts/split_strands.py:171: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if args.output_rev_bam is not '-':
Traceback (most recent call last):
  File "/projects/ps-yeolab3/ekofman/sc_STAMP_pipeline/STAMP/workflow/scripts/split_strands.py", line 20, in <module>
    import pysam
ModuleNotFoundError: No module named 'pysam'
[Mon Mar 29 16:41:06 2021]
Error in rule split_strands:
    jobid: 0
    output: 1_split_strands/TWA1_possorted_genome_bam_MD-GTCGCGACACGAGGTA-1.bam.fwd.bam, 1_split_strands/TWA1_possorted_genome_bam_MD-GTCGCGACACGAGGTA-1.bam.rev.bam
    conda-env: /projects/ps-yeolab3/ekofman/sc_STAMP_pipeline/STAMP/workflow/.snakemake/conda/7c375b6b
    shell:
        
        python scripts/split_strands.py -i /projects/ps-yeolab3/ekofman/sc_STAMP_pipeline/STAMP/workflow/inputs/TWA1_possorted_genome_bam_MD-GTCGCGACACGAGGTA-1.bam -f 1_split_strands/TWA1_possorted_genome_bam_MD-GTCGCGACACGAGGTA-1.bam.fwd.bam -r 1_split_strands/TWA1_possorted_genome_bam_MD-GTCGCGACACGAGGTA-1.bam.rev.bam
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Nodes:        tscc-1-37

As the log shows, Snakemake reports "Activating conda environment", yet the script then fails to import pysam, even though the same import works when I activate that environment manually.
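One way to check which interpreter a rule really runs is to call a small diagnostic script from the rule's shell block in place of split_strands.py. The log already gives a hint: the SyntaxWarning about "is not" with a literal was introduced in Python 3.8, and ModuleNotFoundError exists only in Python 3.6 and later, so the python that ran here does not look like one from a Python 2.7 environment. The sketch below is only an illustration (the file name check_env.py and the function names are hypothetical, not part of the pipeline), and it is written to run on both Python 2.7 and Python 3:

```python
# check_env.py (hypothetical helper, not part of the original pipeline)
import os
import sys


def module_available(name):
    """Return True if the module can be imported by this interpreter."""
    try:
        __import__(name)
        return True
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False


def describe_interpreter(module_name="pysam"):
    """Collect the interpreter path, version, and visibility of one module."""
    return [
        ("executable", sys.executable),
        ("prefix", sys.prefix),
        ("version", "%d.%d.%d" % tuple(sys.version_info[:3])),
        # CONDA_PREFIX is set by conda when an environment is activated
        ("conda_prefix", os.environ.get("CONDA_PREFIX")),
        ("module_found", module_available(module_name)),
    ]


if __name__ == "__main__":
    for key, value in describe_interpreter():
        print("{}: {}".format(key, value))
```

If the reported executable does not live under the .snakemake/conda/7c375b6b directory, the activation message is misleading and the job is picking up a different python from PATH.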

This is how the rule is specified:

rule split_strands:
    input: 
        input_bam=config["samples_path"]+"{sample}",
        index=config["samples_path"]+"{sample}.bai"
    output: 
        output_fwd="1_split_strands/{sample}.fwd.bam",
        output_rev="1_split_strands/{sample}.rev.bam"
    conda:
        "envs/python2.7.yaml"
    shell:
        """
        python scripts/split_strands.py -i {input.input_bam} -f {output.output_fwd} -r {output.output_rev}
        """

I have verified that the hash 7c375b6b corresponds to the appropriate env specified in python2.7.yaml.

Any ideas what might be happening? My rules are run on a cluster and submitted via qsub commands.


Solution

  • It turns out that newer versions of Snakemake (6.0.0 and later) seem to have an issue with this. I used Snakemake 5.8.2 instead and everything works fine. I'm not sure exactly what is going on under the hood, but it appears identical to this issue: https://github.com/snakemake/snakemake/issues/883
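
To tell quickly whether a given machine is running a release in the range described above, a version check along these lines may help. This is a minimal sketch: the 6.0.0 threshold comes only from the observation in this answer, not from a confirmed list of affected releases, and the function names are hypothetical. It relies on `snakemake --version`, which prints the installed version.

```python
import re
import subprocess


def parse_version(text):
    """Turn a version string such as '5.8.2' into a tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (text,))
    return tuple(int(part) for part in match.groups())


def is_possibly_affected(version_text, first_affected=(6, 0, 0)):
    """Assumption: releases at or above first_affected may show the problem."""
    return parse_version(version_text) >= first_affected


def installed_snakemake_version():
    """Ask the snakemake found on PATH for its version string."""
    output = subprocess.check_output(["snakemake", "--version"])
    return output.decode().strip()
```

Calling `is_possibly_affected(installed_snakemake_version())` on the submit host and inside a cluster job would show whether both use the same Snakemake release.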