I have the following files in my folder:
11111-chr1A1.txt
11111-chr1C2.txt
11111-chr1D3.txt
11111-chr114.txt
11111-chr10A1.txt
11111-chr10-C2.txt
11111-chr1003.txt
11111-chr10-4.txt
I need to feed them into parallel in chunks, grouped by chr{number}. The chr{number} part must match exactly, which is what causes the problem here.
For example,
parallel "echo {1}"
should output the first chunk:
11111-chr1A1.txt
11111-chr1C2.txt
11111-chr1D3.txt
11111-chr114.txt
Then the second chunk:
11111-chr10A1.txt
11111-chr10-C2.txt
11111-chr1003.txt
11111-chr10-4.txt
I tried:
for i in {1..10}
do
parallel "echo {1/}" ::: *chr"$i"*txt
done
For i=1 this outputs all the files at once, because the pattern chr1 also matches every chr10 file.
If it helps, I am fine with creating a CSV file beforehand, for example with all chr1 files in the first column and all chr10 files in the second, and then feeding it into parallel one column at a time.
I assume you want longer matches to take precedence over shorter matches.
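To make that assumption concrete, here is a minimal sketch of the precedence rule as a bash function. The name chr_group is hypothetical, and the range 1..10 is taken from the loop in the question. It tries the numbers in descending order, so a name containing chr10 is claimed by chr10 before chr1 gets a chance.

```shell
#!/usr/bin/env bash
# chr_group (hypothetical helper): print the chr group a file name belongs to.
# Numbers are tried from 10 down to 1, so chr10 takes precedence over chr1.
chr_group() {
  local name=$1 n
  for n in {10..1}; do
    case $name in
      *chr"$n"*) printf 'chr%s\n' "$n"; return 0 ;;
    esac
  done
  return 1  # no chr1..chr10 in the name
}

# Classify a few of the sample names from the question
for f in 11111-chr1A1.txt 11111-chr114.txt 11111-chr1003.txt; do
  printf '%s -> %s\n' "$f" "$(chr_group "$f")"
done
```

With this rule 11111-chr114.txt lands in chr1 only because chr11 is not among the numbers tried. If the range were extended past 10, that file would be claimed by chr11 instead.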
# Split files into groups - each in their own dir
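# -j1 is important to force *chr10* to be moved before *chr1*
# The ls test skips numbers with no matching files, so no empty dirs are created
parallel -j1 'ls *{}* >/dev/null 2>&1 && mkdir -p out/{} && mv *{}* out/{}' ::: chr{10..1..1}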
do_group() {
  # Stop if the dir cannot be entered, rather than globbing in the wrong place
  cd "$1" || return
  parallel echo ::: *
}
export -f do_group
# Run each dir separately
parallel --tag do_group ::: out/*
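If moving the files is not wanted, the same grouping can be written to list files instead, which is close to the CSV idea in the question. This is only a sketch under the same assumptions: the function name make_group_lists and the .list file names are made up for illustration, and the groups are limited to chr1..chr10.

```shell
#!/usr/bin/env bash
# make_group_lists (hypothetical helper): write one list file per chr group
# into directory $1, taking the file names from the remaining arguments.
# No files are moved or renamed.
make_group_lists() {
  local listdir=$1; shift
  local -a remaining=("$@") rest
  local n f
  mkdir -p "$listdir"
  for n in {10..1}; do          # descending, so chr10 is claimed before chr1
    rest=()
    for f in "${remaining[@]}"; do
      if [[ $f == *chr"$n"* ]]; then
        printf '%s\n' "$f" >> "$listdir/chr$n.list"
      else
        rest+=("$f")            # still unclaimed, try the next number
      fi
    done
    remaining=("${rest[@]}")
  done
}

# Example use (assumes GNU parallel is installed and a fresh list dir):
#   lists=$(mktemp -d)
#   make_group_lists "$lists" *chr*.txt
#   for l in "$lists"/*.list; do parallel echo :::: "$l"; done
```

Each list file holds one chunk, and :::: makes parallel read its arguments from that file. The lists are appended to, so use a fresh directory for every run.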