
Snakemake: how to move created files into a new directory of the same name


Some code taken from: Snakemake - How do I use each line of a file as an input?

def read_tissues_output():
    with open('accessions.txt') as f:
        samples = [sample for sample in f.read().split('\n') if len(sample) > 0]  # we don't want empty lines
        return expand("{sample}.txt", sample=samples)


rule all:
    input:
        read_tissues_output()
         
         
rule create_directory:
    output:
        "{n}/{n}.txt"
    shell:
        """
        echo this is a {wildcards.n} > {output}
        """



accessions.txt is a text file with one accession number per line. For each accession, I want to create a directory with the same name as the text file being generated and place that text file inside it.

When the output is "{n}.txt", each text file is generated in the same directory as the Snakefile, which works. With the rule as written above, I expect each accession's .txt file to end up in a folder named after that accession, but instead I get an error saying:

"Missing input files for rule all: affected files: SRRXXXXX.txt"

I'm not sure why this is the case.


Solution

  • You need to modify your read_tissues_output() function so that it returns this instead:

    return expand("{sample}/{sample}.txt", sample=samples)
    

    Snakemake builds chains of rules by matching output file names to input file names, so in general if you change the output of a rule you also need to change the input of the rule (or rules) that consume that output. In this case rule all gets its list of inputs from your function, so the function needs to change. As written, it still asks for SRRXXXXX.txt in the working directory, which no rule produces once the output is "{n}/{n}.txt", hence the "Missing input files" error.
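To see the mismatch without running Snakemake, here is a minimal plain-Python sketch of the matching idea. It is not Snakemake's actual implementation: `build_targets` is a hypothetical stand-in for `expand()` with a single wildcard, the regular expression only approximates how the output pattern "{n}/{n}.txt" is matched, and `SRR0000001` is a made-up placeholder accession.

```python
import re


def build_targets(samples, pattern):
    """Hypothetical stand-in for expand() with one wildcard named 'sample'."""
    return [pattern.format(sample=s) for s in samples]


# Rough approximation of the rule output "{n}/{n}.txt":
# the directory name and the file stem must be the same string.
OUTPUT_PATTERN = re.compile(r"^(?P<n>.+)/(?P=n)\.txt$")


def some_rule_can_produce(path):
    """True if the path fits the create_directory output pattern."""
    return OUTPUT_PATTERN.match(path) is not None


samples = ["SRR0000001"]  # placeholder accession, not real data

# Targets requested by the original function: no rule output matches them.
old_targets = build_targets(samples, "{sample}.txt")
# Targets requested after the fix: they match "{n}/{n}.txt".
new_targets = build_targets(samples, "{sample}/{sample}.txt")

old_ok = [some_rule_can_produce(t) for t in old_targets]
new_ok = [some_rule_can_produce(t) for t in new_targets]
```

With the original pattern every entry of `old_ok` is `False`, which corresponds to the "Missing input files" error, while the corrected pattern gives `True` for every target. Snakemake also creates the missing `{n}` directory itself before running the shell command, so no separate `mkdir` or move step should be needed.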