I'm trying to write my first pipeline with Nextflow. I'm passing in two paired-end read files and want to run bbduk on them, but I don't know why my code doesn't work.
I tried the following code:
#!/usr/bin/env nextflow
/*
* Pipeline Metagenomics, conda ambient Metagenomics_Nextflow
*/
nextflow.enable.dsl=2 // Nextflow version
params.fq1 = "$HOME/../*{1}.fq.gz" // Can be modified in the script with the --fq1 <value> command
params.fq2 = "$HOME/..//*{2}.fq.gz" // Can be modified in the script with the --fq2 <value> command
/*
* params.out = 'OTU.tsv' // result file
* params.databaseHuman = "$baseDir/Database/GRCh38" //
*/
/*
* Check the quality of raw sequences
*/
process BBduk {
    input:
    tuple val(meta), path(reads)
    tag {meta.id}
    output:
    tuple val(meta), path('*.fq.gz'), emit: reads
    tuple val(meta), path('*.log'), emit: log
    script:
    def prefix = file("${meta.id}")
    def raw = "in1=${reads[0]} in2=${reads[1]}"
    def trimmed = "out1=${prefix}_1.fastq.gz out2=${prefix}_2.fastq.gz"
    """
    echo(${prefix})
    echo(${raw})
    echo(${trimmed})
    bbduk.sh \\
    $raw \\
    $trimmed \\
    qtrim=r trimq=10 minlen=100 \\
    &> ${prefix}.bbduk.log
    """
}
/*
* Workflow
*/
workflow {
    sequences = Channel.fromFilePairs( [params.fq1,params.fq2] )
    println "Performing Quality control and triming from $sequences"
    BBduk(sequences)
}
The code produces the following error in the terminal:
ERROR ~ Error executing process > 'BBduk (1)'
Caused by:
No such variable: id -- Check script 'Workflow_Clean_Human_DNA.nf' at line: 34
Source block:
def prefix = file("${meta.id}")
def raw = "in1=${reads[0]} in2=${reads[1]}"
def trimmed = "out1=${prefix}_1.fastq.gz out2=${prefix}_2.fastq.gz"
"""
echo(${prefix})
echo(${raw})
echo(${trimmed})
bbduk.sh \\
$raw \\
$trimmed \\
qtrim=r trimq=10 minlen=100 \\
&> ${prefix}.bbduk.log
"""
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
The idea is to run bbduk on the raw reads; I tried to use a tuple to make it work. I want to do this in Nextflow because I plan to chain the output into other tools.
Thanks!
The fromFilePairs factory method emits tuples in which the first element is the grouping key and the second element is the list of matching files. The key is a plain String (java.lang.String), not a Map of metadata, which is why meta.id fails with "No such variable: id". I think all you need is something like:
params.reads = './*_{1,2}.fq.gz'
process BBduk {
    tag { sample_id }
    input:
    tuple val(sample_id), path(reads, stageAs: 'reads/*')
    output:
    tuple val(sample_id), path("${sample_id}_{1,2}.fq.gz"), emit: reads
    tuple val(sample_id), path("${sample_id}.bbduk.log"), emit: log
    script:
    def (fq1, fq2) = reads
    """
    bbduk.sh \\
        in1="${fq1}" \\
        in2="${fq2}" \\
        out1="${sample_id}_1.fq.gz" \\
        out2="${sample_id}_2.fq.gz" \\
        qtrim=r \\
        trimq=10 \\
        minlen=100 \\
        &> "${sample_id}.bbduk.log"
    """
}

workflow {
    reads = Channel.fromFilePairs( params.reads )
    BBduk( reads )
}
Note that by staging the input files in a subdirectory (using stageAs: 'reads/*'), we ensure we don't accidentally clobber them in the working directory, since the output file names could otherwise match the staged input file names.
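If you would rather keep the meta-map style from your original process, a minimal sketch (assuming the same params.reads glob as above) is to inspect what the channel emits with view and then wrap the key in a Map yourself. The channel name reads_with_meta is only an illustrative name, not something Nextflow provides:

```nextflow
params.reads = './*_{1,2}.fq.gz'

workflow {
    Channel
        .fromFilePairs( params.reads )
        // Print each emitted tuple so you can see that the first
        // element is a String key, followed by the list of matched files
        .view()
        // Wrap the String key in a Map so that downstream processes
        // can refer to meta.id (reads_with_meta is a hypothetical name)
        .map { sample_id, reads -> [ [id: sample_id], reads ] }
        .set { reads_with_meta }
}
```

A process that declares its input as tuple val(meta), path(reads) should then be able to use meta.id, because the first element of each tuple is now a Map rather than a String.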