I have a process generating two files that I am interested in, hitsort.cls and contigs.fasta. I output these using publishDir:
process RUN_RE {
    publishDir "$baseDir/RE_output", mode: 'copy'

    input:
    file 'interleaved.fq'

    output:
    file "${params.RE_run}/seqclust/clustering/hitsort.cls"
    file "${params.RE_run}/contigs.fasta"

    script:
    """
    some_code
    """
}
Now, I need these two files to be an input for another process but I don't know how to do that.
I have tried calling this process with
NEXT_PROCESS(params.hitsort, params.contigs)
while specifying the input as:
process NEXT_PROCESS {
    input:
    path hitsort
    path contigs
but it's not working, because only the basename is used instead of the full path. Basically what I want is to wait for RUN_RE to finish, and then use the two files it outputs for the next process.
It's best to avoid accessing files in the publishDir, since:
Files are copied into the specified directory in an asynchronous manner, thus they may not be immediately available in the published directory at the end of the process execution. For this reason files published by a process must not be accessed by other downstream processes.
The recommendation is therefore to ensure your processes only access files in the working directory (i.e. ./work). What this means is: it's best to avoid things like absolute paths in your input and output declarations. This will also help ensure your workflows are portable.
nextflow.enable.dsl=2
params.interleaved_fq = './path/to/interleaved.fq'
params.publish_dir = './results'
process RUN_RE {
    publishDir "${params.publish_dir}/RE_output", mode: 'copy'

    input:
    path interleaved

    output:
    path "./seqclust/clustering/hitsort.cls", emit: hitsort_cls
    path "./contigs.fasta", emit: contigs_fasta

    """
    # do something with ${interleaved}...
    ls -l "${interleaved}"
    # create some outputs...
    mkdir -p ./seqclust/clustering
    touch ./seqclust/clustering/hitsort.cls
    touch ./contigs.fasta
    """
}

process NEXT_PROCESS {
    input:
    path hitsort
    path contigs

    """
    ls -l
    """
}

workflow {
    interleaved_fq = file( params.interleaved_fq )
    NEXT_PROCESS( RUN_RE( interleaved_fq ) )
}
The above workflow block is effectively the same as:
workflow {
    interleaved_fq = file( params.interleaved_fq )
    RUN_RE( interleaved_fq )
    NEXT_PROCESS( RUN_RE.out.hitsort_cls, RUN_RE.out.contigs_fasta )
}
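If the outputs had been declared without emit names, they could still be passed on by position. The following is only a sketch of that alternative, not a complete script: it assumes the same RUN_RE and NEXT_PROCESS definitions and the same params.interleaved_fq as above.

```nextflow
workflow {
    interleaved_fq = file( params.interleaved_fq )
    RUN_RE( interleaved_fq )

    // Output channels are indexed in the order they appear in the
    // process's output block: [0] is hitsort.cls, [1] is contigs.fasta.
    NEXT_PROCESS( RUN_RE.out[0], RUN_RE.out[1] )
}
```

Named outputs are usually easier to read and keep working if the order of the output declarations changes, so the emit form is probably the safer default.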