I have this directory structure:
|-- combine.nf
|-- first
|   |-- a.txt
|   |-- b.txt
|   `-- c.txt
|-- second
|   |-- A.txt
|   `-- B.txt
And combine.nf is:
#!/usr/bin/env nextflow

process sayHello {
    input:
    path first
    path second

    output:
    stdout

    script:
    """
    echo 'Hello couple (${first}, ${second})'
    """
}

workflow {
    def files_first = Channel.fromPath("first/*.txt")
    def files_second = Channel.fromPath("second/*.txt")
    sayHello(files_first, files_second) | view { it }
}
The sayHello process is only called for two pairs (in fact, the size of the smallest directory):
Hello couple (a.txt, A.txt)
Hello couple (b.txt, B.txt)
How can I process all possible pairs? Thanks in advance.
PS: this question is generic; in my case, one of the directories contains only one file.
A process that consumes elements from two independent 'queue' channels takes one value from each channel per execution, so the number of executions is determined by the shorter channel. Once that channel is exhausted, the remaining items in the longer one are never consumed. That is exactly what you are seeing.
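To see this pairing in isolation, here is a minimal sketch that does not depend on any files. The process name `pairUp` and the channel contents are hypothetical and chosen only for illustration. With three numbers and two letters, the process should run twice.

```nextflow
#!/usr/bin/env nextflow

// Hypothetical demo process, not part of the original pipeline.
process pairUp {
    input:
    val number
    val letter

    output:
    stdout

    script:
    """
    echo 'pair (${number}, ${letter})'
    """
}

workflow {
    def numbers = Channel.of(1, 2, 3)   // queue channel with three items
    def letters = Channel.of('a', 'b')  // queue channel with two items

    // Each execution takes one item from each channel, so the process
    // stops once the shorter channel (letters) is exhausted.
    pairUp(numbers, letters).view()
}
```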
What you need to do is combine the two channels into a single one that contains all pairs:
workflow {
    def files_first = Channel.fromPath("first/*.txt")
    def files_second = Channel.fromPath("second/*.txt")
    all_pairs = files_first.combine(files_second)
    sayHello(all_pairs) | view { it }
}
Then you need to modify the process so that it takes only the combined channel as input:
process sayHello {
    input:
    tuple path(first), path(second)
    ...
}
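Putting the two pieces together, the complete combine.nf could look like the sketch below. It assumes the directory layout from the question and fills the elided part of the process with the output and script blocks from the original version. With three files in first and two in second, `combine` should emit six tuples, so the process should run six times. The order of the printed lines is not guaranteed.

```nextflow
#!/usr/bin/env nextflow

process sayHello {
    input:
    // One tuple per execution: a file from first/ and a file from second/
    tuple path(first), path(second)

    output:
    stdout

    script:
    """
    echo 'Hello couple (${first}, ${second})'
    """
}

workflow {
    def files_first = Channel.fromPath("first/*.txt")
    def files_second = Channel.fromPath("second/*.txt")

    // combine emits every (first, second) pair as a tuple
    def all_pairs = files_first.combine(files_second)

    sayHello(all_pairs) | view { it }
}
```

If one directory really holds only a single file, another option would be to turn that channel into a value channel (for example with the `first()` operator) and keep the two separate inputs, since a value channel can be read by every execution. The `combine` approach covers both cases.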