I have a question about channel.fromFilePairs(). I have the following Nextflow script:
params.reads = "/path/to/my_reads/sample03_L001_R{1,2}_001.fastq.gz"
my_reads_ch = channel.fromFilePairs(params.reads)
println "reads: $my_reads_ch"
The script prints [sample03_L001_R, [/path/to/my_reads/sample03_L001_R1_001.fastq.gz, /path/to/my_reads/sample03_L001_R2_001.fastq.gz]].
The desired output is [sample03, [/path/to/my_reads/sample03_L001_R1_001.fastq.gz, /path/to/my_reads/sample03_L001_R2_001.fastq.gz]].
How do I remove the "_L001_R"?
I've tried:
channel.fromFilePairs(params.reads).map{it[0] - /_\w+/, it[1]}
That gives me an ERROR: Unknown method invocation 'negative' on Pattern type.
Any suggestions? Many thanks
You just need to use the tilde operator (~) to create a Pattern object first, and then subtract that pattern from the sample name:
Channel
    .fromFilePairs( params.reads )
    .map { sample, reads -> tuple( sample - ~/_\w+$/, reads ) }
    .view()
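To see what the subtraction does in isolation, here is a minimal plain-Groovy sketch that runs outside Nextflow. The variable names (key, suffix, sample, strict) and the tighter pattern /_L\d+_R$/ are my own illustrations, not part of your script, and the second input is a made-up sample name:

```groovy
// Grouping key as emitted by fromFilePairs in the question
def key = 'sample03_L001_R'

// The tilde turns the slashy string into a java.util.regex.Pattern
def suffix = ~/_\w+$/
assert suffix instanceof java.util.regex.Pattern

// String minus Pattern removes the first match of the pattern
def sample = key - suffix
println sample   // prints: sample03

// Because \w also matches underscores, /_\w+$/ removes everything from the
// FIRST underscore onwards. A tighter (hypothetical) pattern keeps
// underscores that belong to the sample name itself.
def strict = 'my_sample_L001_R' - ~/_L\d+_R$/
println strict   // prints: my_sample
```

If your sample names can contain underscores, something like the tighter pattern is probably the safer choice, but adjust it to your actual lane and read naming.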
A better approach, though, is to use a wildcard in your initial glob pattern and let the fromFilePairs channel factory strip off the suffix for you. For example:
params.reads = "/path/to/my_reads/*_L001_R{1,2}_001.fastq.gz"
Channel
    .fromFilePairs( params.reads )
    .view()
Results:
$ nextflow run main.nf
N E X T F L O W ~ version 24.04.3
Launching `main.nf` [golden_wescoff] DSL2 - revision: f7979e483d
[sample01, [/path/to/my_reads/sample01_L001_R1_001.fastq.gz, /path/to/my_reads/sample01_L001_R2_001.fastq.gz]]
[sample03, [/path/to/my_reads/sample03_L001_R1_001.fastq.gz, /path/to/my_reads/sample03_L001_R2_001.fastq.gz]]
[sample02, [/path/to/my_reads/sample02_L001_R1_001.fastq.gz, /path/to/my_reads/sample02_L001_R2_001.fastq.gz]]