How can I use Snakemake wildcards in an R script?


I am facing the following issue:

I have a rule in Snakemake that looks something like this:

rule somerule:
    input:
        tables = expand("results/{tables}_table.txt", tables = ["1", "2"]),
    output:
        edit = expand("results/{tables}_edited.txt", tables = ["1", "2"]),
    conda:
        "../envs/r.yaml"
    script:
        "../scripts/script.R"

The script used in this rule is as follows:

library()

tab <- read.table(snakemake@input[["tables"]], sep = "\t", header = TRUE)

edit <- some_function(tab)

write.table(edit, file = snakemake@output[["edit"]], sep = "\t", quote = FALSE, row.names = TRUE, col.names = TRUE)

When I execute the code, I get this error:

Error in file(file, 'rt'): invalid 'description' argument
Called from: read.table -> file
Execution halted

However, when I input the data individually (not as wildcards), the script works fine and produces the correct output files. Where could the problem be? My suspicion is that Snakemake is passing all the input data to the R script simultaneously instead of running them individually, but I'm not sure if that's the case and even less sure how to fix it.


Solution

  • A nice person on another website replied to me. I am leaving the reply here in case it is useful to someone.

    "Your problem is that Snakemake is passing all the input data to the R script simultaneously instead of running them individually :)

    As written, your rule takes a list of input files and gives a list of output files (expand is just a formatting helper function that returns a list based on its arguments). Instead, maybe have one rule that calls the R script with one input and one output, and another rule that asks for all those outputs as an easy way to run the whole thing at once.

    Maybe something like this? I changed tables to table here just to emphasize it's one by one.

    rule all:
        input: expand("results/{table}_edited.txt", table = ["1", "2"])
    
    rule somerule:
        input:
            table = "results/{table}_table.txt"
        output:
            edit = "results/{table}_edited.txt"
        conda: "../envs/r.yaml"
        script: "../scripts/script.R"

    "
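
To see why the original rule hands the script several files at once, here is a minimal Python sketch of what expand does. The function expand_sketch is a hypothetical stand-in written for illustration, not Snakemake's own implementation; it only assumes that expand fills the pattern with every combination of the listed values and returns a list.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Hypothetical stand-in for Snakemake's expand(), for illustration only.

    Fills `pattern` with every combination of the given wildcard values
    and returns the resulting paths as a list.
    """
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


# Same pattern and values as the input of the original rule.
inputs = expand_sketch("results/{tables}_table.txt", tables=["1", "2"])
print(inputs)  # ['results/1_table.txt', 'results/2_table.txt']
```

Because the result is a list of two paths, snakemake@input[["tables"]] in the original rule holds both files, while read.table expects a single one. In the per-file rule from the reply the input is named table, so the script would presumably need to read snakemake@input[["table"]] instead.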