I have a Snakemake rule where I call a Python script with the script keyword:
    rule merge_results:
        input:
            [...]
        output:
            path_results
        script:
            "merge_results.py"
However, I would like to also call an R script in this rule, depending on a condition. I wanted to do something like this:
    rule merge_results:
        input:
            [...]
        output:
            path_results
        script:
            "merge_results.py"
            if cond:
                "my_script.R"
but Snakemake won't allow it. I know I could call my scripts using the run directive and shell(), but my Python and R scripts use the snakemake.input variable in many places, so I would have to change a lot of their code to call them a different way. Is there a way to avoid that and use the script keyword with multiple files?
It is not possible to run multiple scripts using the script directive.
I think you should reconsider your approach. Do these scripts depend on each other? As in, does one need to run before the other? Do they write to the same output file, or do they generate separate results? Would it be possible to split this into two separate rules and define dependencies between them? You could use a 'flag file' for this, i.e. let one rule create an empty output file, marked as temp(), that is then used as input for the other rule. Or, if the scripts do in fact produce different output files, you can use a target rule that aggregates these files by listing them in its input directive.
If you are able to split this into two rules, you could use the params directive to compute the condition for the if statement, and then within my_script.R check snakemake@params[["myParam"]] to decide whether to run the logic in the script. However, since Snakemake expects to find the rule's output file once execution completes, the script would have to create an empty file when the if condition evaluates to false.
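The "create an empty file when the condition is false" guard could be sketched as follows. It is written in Python for brevity (the same idea applies in an R script), and the names run_if_enabled and work are hypothetical, not part of Snakemake.

```python
from pathlib import Path


def run_if_enabled(enabled, output_path, work):
    """Run work(path) when enabled; otherwise leave an empty placeholder file.

    Either way the output path exists afterwards, so Snakemake's check for
    the rule's declared output would pass.
    """
    out = Path(output_path)
    # Make sure the output directory exists before writing or touching.
    out.parent.mkdir(parents=True, exist_ok=True)
    if enabled:
        work(out)    # the real logic is expected to write the output file
    else:
        out.touch()  # empty placeholder so the declared output exists
    return out
```

Inside a script run through the script directive, the flag would presumably come from snakemake.params (for example snakemake.params.myParam) and the path from snakemake.output[0].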
If you are unable to split this into multiple rules, one way to go about this would be to use the run directive in combination with shell(), providing input and output as command line arguments:
    rule merge_results:
        input:
            [...]
        output:
            path_results
        run:
            shell("python3 merge_results.py -i {input} -o {output}")
            if cond:
                shell("Rscript my_script.R -i {input} -o {output}")
To make this work in Python scripts you will need argparse, or, if you are fine with less flexibility, you could omit the -i and -o flags and read the arguments from sys.argv instead. For R scripts, you can use optparse to read flagged arguments, or commandArgs for use without flags. Either way, you will have to edit your scripts to remove the snakemake.input references and replace them with the arguments provided on the command line.