[SOLVED] split content of csv file containing path and other fields in nextflow

I want to read a file containing the following columns:

PathToFile, SomeNumber, SomeString

into a Channel.

How do I open such a file so that PathToFile is a file or path type and the other two are val type?

I can open the file as:

    Channel.fromPath(params.list)
        .splitCsv() //{ it.trim() }
        .view(row -> file("${row[0]}"))

and it works great! But I don't want to just view it, I want to USE it in a process. Do I have to convert it to a file INSIDE the first process?

Thanks! P.S. What if I want to open a TSV instead of a CSV?

Solution

Given an example test.csv file:

    path,number,string
    file1.txt,1,abc
    file2.txt,2,bcd

I'm not sure I understood "USE it in a process" correctly. If so, your main.nf can look like:

    #!/usr/bin/env nextflow
    nextflow.enable.dsl=2

    // Parse the CSV and convert the first column into a file object
    samples = Channel
        .fromPath("test.csv")
        .splitCsv(header: true)
        .map { row -> tuple(file(row.path), row.number, row.string) }
        .view()

    process echo_channel {
        debug true

        input:
        // path replaces the deprecated file qualifier in DSL2
        tuple path(input_file), val(number), val(string)

        script:
        """
        echo "File name: $input_file"
        echo "Number: $number"
        echo "String: $string"
        echo "File content:"
        cat $input_file
        """
    }

    workflow {
        echo_channel(samples)
    }

I included .view() to preview the channel content. The debug true directive in the process definition is what prints the process's stdout (the echo and cat output) to the terminal.

Running nextflow run main.nf then gives:

    N E X T F L O W  ~  version 23.04.2
    Launching `main.nf` [dreamy_leibniz] DSL2 - revision: e6f30e68ff
    executor >  local (2)
    [a7/10df8f] process > echo_channel (1) [100%] 2 of 2 ✔
    [/home/art/test/nf/file1.txt, 1, abc]
    [/home/art/test/nf/file2.txt, 2, bcd]
    File name: file2.txt
    Number: 2
    String: bcd
    File content:
    This is file 2.
    File name: file1.txt
    Number: 1
    String: abc
    File content:
    This is file 1.
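
If the file has no header row and has spaces after the commas, as the column list in the question suggests, a variant along these lines might work. This is only a sketch: list.csv is a hypothetical file name, and it assumes the columns always appear in the order path, number, string. The strip option of splitCsv trims leading and trailing blanks from each value, which covers what the commented-out it.trim() in the question was presumably meant to do.

```nextflow
// Sketch for a headerless CSV; 'list.csv' is a hypothetical file name.
samples = Channel
    .fromPath('list.csv')
    .splitCsv(strip: true)   // no header: each row is a plain list of strings
    .map { row -> tuple(file(row[0]), row[1], row[2]) }   // columns accessed by position
```

The resulting channel has the same shape as the one above, so it can be passed to echo_channel unchanged.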

To use a tab (or another separator) instead of a comma, simply change the call to .splitCsv(header: true, sep: '\t').
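
For completeness, the channel definition for a tab-separated file might then look like the following minimal sketch. It assumes a hypothetical test.tsv with the same path, number and string header as the CSV example; only the sep option differs.

```nextflow
// Sketch for a TSV input; 'test.tsv' is a hypothetical file name.
samples = Channel
    .fromPath('test.tsv')
    .splitCsv(header: true, sep: '\t')   // split on tabs instead of commas
    .map { row -> tuple(file(row.path), row.number, row.string) }
```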