I am here because I have not come across a solution on the internet yet. I am trying to write a Nextflow workflow that splits a big table, computes statistics for each split table, and then merges the small stats tables.
I am having trouble with the table-splitting process. I want to split the table while keeping the header intact in each of the smaller ones. The bash code is something like this:
head -n1 '${table2parse}' > header.tsv ## take the header line
tail -n+2 '${table2parse}' | split -l 4 - chunk_ ## split the table w/o headers
for f in chunk_*; do cat header.tsv $f > 'split_table_$f.tsv'; done ## add the header to each chunk
So far this works. However, when I tried to incorporate this into a Nextflow pipeline:
process splitTable {
input:
path table2parse
output:
path 'split_table_*'
"""
head -n1 '${table2parse}' > header.tsv ## take the header line
tail -n+2 '${table2parse}' | split -l 4 - chunk_ ## split the table w/o headers
for f in chunk_*; do cat header.tsv $f > 'split_table_$f.tsv'; done ## add the header to each chunk
"""
}
I get this error:
Caused by:
No such variable: f -- Check script 'trial.nf' at line: 16
Apparently Nextflow confuses the bash variable with its own variables. I tried using the escape character '\f', and establishing it as a Nextflow variable, but to no avail.
I would be really grateful to anyone with suggestions.
PS: I have recently started learning the DSL2 syntax of Nextflow; if you have recommendations on that, I am all ears!
Reduce the problem to a Bash script that takes the required parameter. Test it independently of the Nextflow process, then call the script from the Nextflow process.
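As a minimal sketch of that idea, the three commands from the question could live in a standalone script. The file name split_table.sh, the function name split_table, and the optional lines-per-chunk argument are hypothetical, chosen only for illustration. Because the loop is no longer inside the Nextflow script string, $f is handled by bash alone; note the double quotes around the output file name, which let bash expand the variable.

```shell
#!/usr/bin/env bash
# split_table.sh (hypothetical name): split a table into chunks,
# repeating the header line at the top of every chunk.
# Usage: split_table.sh TABLE [LINES_PER_CHUNK]
set -euo pipefail

split_table() {
    local table="$1"
    local lines="${2:-4}"    # assumed default of 4 lines, as in the question

    head -n 1 "$table" > header.tsv                    # take the header line
    tail -n +2 "$table" | split -l "$lines" - chunk_   # split the table w/o header

    # add the header to each chunk; double quotes so that $f is expanded
    for f in chunk_*; do
        cat header.tsv "$f" > "split_table_${f}.tsv"
    done
}

# Run only when at least one argument is given on the command line.
if [ "$#" -ge 1 ]; then
    split_table "$@"
fi
```

Once this works on its own (for example: bash split_table.sh my_table.tsv), the process script would shrink to a single line such as split_table.sh '${table2parse}'. This assumes the script is executable and sits in the pipeline's bin/ directory, which Nextflow adds to the PATH of each task.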