How to pass filenames and parameters from a CSV to a Nextflow process?

I am trying to build a Nextflow pipeline where I need to read a CSV file and use its content to run a Python script. Right now it is just the beginning of the entire pipeline.

My CSV file (objects.csv) looks like this:

Filename,p
file1,12345
file2,51512
file3,67223
...

It contains:

- a file name without the extension (1st column), and
- a related parameter (2nd column), let's call it p.

The CSV file itself, as well as the files named in its first column, are stored in another folder. For now I would like to access them via relative paths.

What I want to achieve is: for each row in the CSV, call a Python script like this:

python3 myscript.py --input_file file1.abc --p 12345

My current attempt in Nextflow looks like this:

params.csv = "../../data/objects.csv"          // CSV location
params.loc_abc_files = "../../data/objects/"   // folder with all available .abc files

process runSimulation {
    input:
    tuple path(abc_file), val(p)

    script:
    """
    python3 myscript.py --input $abc_file --p $p
    """
}

workflow {
    
    Channel
        .fromPath(params.csv)
        .splitCsv(sep: ",", header: true)
        .map { row -> tuple(params.loc_abc_files + row.Filename + ".abc", row.Parameter) }
        .set { input_files }

    runSimulation(input_files)

}

Right now Nextflow complains that I have not passed a 'valid Path value'. That makes sense, since I only build a string inside the .map() operator. How should I set this up properly so that the file reaches the process as a Path value?

I also want to make sure that the process runSimulation() can be executed in parallel on all the files within the CSV if the resources are available.

Solution

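As you suspected, the .map() closure builds a plain string, while the path(abc_file) input expects a Path object.

Use file() to convert the string into a Path. Also note that the CSV header is p, so the value has to be read as row.p rather than row.Parameter.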

For example:

params.csv = "../../data/objects.csv"
params.loc_abc_files = "../../data/objects/"

process runSimulation {
    input:
    tuple path(abc_file), val(p)

    script:
    """
    python3 myscript.py --input_file ${abc_file} --p ${p}
    """
}

workflow {
    Channel
        .fromPath(params.csv)
        .splitCsv(sep: ",", header: true)
        .map { row -> tuple(file(params.loc_abc_files + row.Filename + ".abc"), row.p) }
        .set { input_files }

    runSimulation(input_files)
}
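
If it helps, here is a slightly more defensive variant of the same workflow. It is only a sketch under a few assumptions: the checkIfExists option is used so that a wrong relative path or a missing .abc file fails early, the maxForks value of 4 is an arbitrary placeholder, and myscript.py is assumed to be reachable from the task's work directory exactly as in the command above.

```nextflow
params.csv = "../../data/objects.csv"
params.loc_abc_files = "../../data/objects/"

process runSimulation {
    // Optional cap on concurrent tasks of this process.
    // The value 4 is a placeholder; remove the line to leave it uncapped.
    maxForks 4

    input:
    tuple path(abc_file), val(p)

    script:
    // Assumption: myscript.py can be found from the task work directory.
    """
    python3 myscript.py --input_file ${abc_file} --p ${p}
    """
}

workflow {
    Channel
        // Fail early if the CSV itself cannot be found
        .fromPath(params.csv, checkIfExists: true)
        .splitCsv(sep: ",", header: true)
        .map { row ->
            // Build the path string, then convert it to a Path;
            // checkIfExists raises an error if the file is missing
            def abc = file(params.loc_abc_files + row.Filename + ".abc", checkIfExists: true)
            tuple(abc, row.p)
        }
        .set { input_files }

    runSimulation(input_files)
}
```

Regarding parallel execution: each tuple emitted by the channel becomes its own runSimulation task, and Nextflow runs those tasks concurrently as far as the executor's resources allow, so no extra code is needed for that. The maxForks directive is only there in case you want to limit how many run at once.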