I want to create a conda environment for a tool, activate it, and use the tool in a Snakemake rule. I set it up as follows:
Snakemake rule:
rule fastqc:
    input:
        #fastq=expand("fastq_dir/{sample}_R{pair}_001.fastq.gz", sample=config["samples"], pair=config["fastq_pairs"])
        fastq_path = lambda wildcards: get_full_path(wildcards.sample)
    output:
        #html=expand("fastqc_dir/{sample}_R{pair}_001_fastqc.html", sample=config["samples"], pair=config["fastq_pairs"]),
        #zip=expand("fastqc_dir/{sample}_R{pair}_001_fastqc.zip", sample=config["samples"], pair=config["fastq_pairs"])
        os.path.join("{output_dir}", "{sample}_fastqc.html")
    params:
        outdir="{output_dir}",
        sample_adapter=os.path.join("../data/adapters", "{sample}.txt")
    log:
        log_file=os.path.join("{output_dir}", "local_log", \
                              "run_FastQC_{sample}.log")
    resources:
        threads = 4,
        mem_mb = 24000,
        runtime = "2h"
    benchmark:
        os.path.join("{output_dir}", "cluster_log", "run_FastQC_{sample}.benchmark.log")
    conda:
        "envs/fastqc.yaml"
    shell:
        """
        conda activate fastqc
        fastqc {input} \
            --threads {resources.threads} \
            --outdir {params.outdir} \
            --kmers 7 \
            --adapters {params.sample_adapter} \
            &> {log.log_file}
        """
The environment file (envs/fastqc.yaml) is:
name: fastqc
channels:
  - conda-forge
  - bioconda
dependencies:
  - fastqc=0.12.1-0
prefix: ./.conda_myproject/envs
When I run Snakemake, my jobs fail with the following error:
EnvironmentNameNotFound: Could not find conda environment: fastqc
Indeed, when I check the location given in prefix, there is no fastqc environment. Instead, I see an environment named:
f2b1d4b45d38fce47f79239411ceb3a4_
inside .snakemake/conda/ in my working directory.
I have tried this many times and it fails every time. I install conda environments inside the project directory rather than in my home directory. Could you help me figure this out? Thank you!
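You don't need to activate the environment before running the command; see the examples in the tutorial. When you run the workflow with --use-conda, Snakemake builds the environment from the file named in the conda: directive, stores it under .snakemake/conda/ in a directory named after a hash, and activates it for the rule's shell command. The name: and prefix: entries in the YAML file are not used to name or place that environment, which is why there is no environment called fastqc to activate. If you want the environments stored somewhere else, pass --conda-prefix on the command line. So this should work: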
rule fastqc:
    input:
        #fastq=expand("fastq_dir/{sample}_R{pair}_001.fastq.gz", sample=config["samples"], pair=config["fastq_pairs"])
        fastq_path = lambda wildcards: get_full_path(wildcards.sample)
    output:
        #html=expand("fastqc_dir/{sample}_R{pair}_001_fastqc.html", sample=config["samples"], pair=config["fastq_pairs"]),
        #zip=expand("fastqc_dir/{sample}_R{pair}_001_fastqc.zip", sample=config["samples"], pair=config["fastq_pairs"])
        os.path.join("{output_dir}", "{sample}_fastqc.html")
    params:
        outdir="{output_dir}",
        sample_adapter=os.path.join("../data/adapters", "{sample}.txt")
    log:
        log_file=os.path.join("{output_dir}", "local_log", \
                              "run_FastQC_{sample}.log")
    resources:
        threads = 4,
        mem_mb = 24000,
        runtime = "2h"
    benchmark:
        os.path.join("{output_dir}", "cluster_log", "run_FastQC_{sample}.benchmark.log")
    conda:
        "envs/fastqc.yaml"
    shell:
        # No "conda activate" here: Snakemake activates the env from the conda: directive.
        """
        fastqc {input} \
            --threads {resources.threads} \
            --outdir {params.outdir} \
            --kmers 7 \
            --adapters {params.sample_adapter} \
            &> {log.log_file}
        """