How to write a GPU worker pool to run multiple tasks at the same time in bash?

Suppose there are 4 CUDA devices (0, 1, 2, 3) on my computer and there are 10 tasks to run. Each task is a script named run01.sh, run02.sh, ..., run10.sh.

The problem is that each task uses only 1 GPU. I want to write a bash script that runs those 10 tasks concurrently and makes the best use of the 4 CUDA devices. How can I do this?

Update:

To address @RenaudPacalet's question:

"Please edit your question and explain how you specify which device to use for a given job."

There are 4 workers (the 4 CUDA devices). What I want is a solution in bash that ensures:

Each worker handles only 1 task at a time.
No worker is idle while tasks remain.

Solution

With GNU Parallel it looks like this:

parallel -j4 CUDA_VISIBLE_DEVICES='$(({%} - 1))' {} ::: run*.sh

This command combines GNU Parallel, a tool for executing jobs in parallel, with a little shell arithmetic to assign a GPU to each job. Here's a detailed breakdown:

parallel: parallel is a command-line driven tool that allows you to execute multiple jobs in parallel. It's particularly useful for running the same command with different arguments or for distributing tasks across multiple cores or nodes.

-j4: This flag tells parallel to run up to 4 jobs at the same time. That gives 4 job slots, one per GPU.

CUDA_VISIBLE_DEVICES: This environment variable controls which GPUs are visible to CUDA-enabled applications.
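To make the scoping concrete, here is a small sketch (not part of the original command) showing that a VAR=value prefix sets the variable only for the command it precedes. The value 2 and the variable name child are just illustrative, and a plain bash child process stands in for a CUDA program:

```shell
# The VAR=value prefix applies to that one command only.
# A real CUDA program started this way would see only physical GPU 2.
child=$(CUDA_VISIBLE_DEVICES=2 bash -c 'echo "$CUDA_VISIBLE_DEVICES"')

echo "child process saw:  $child"
# The calling shell's own setting (if any) is left untouched.
echo "calling shell has:  ${CUDA_VISIBLE_DEVICES:-<unset>}"
```

This is why the prefix form is convenient in a worker pool: each job gets its own value without the jobs interfering with one another.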

{%} is a special replacement string in parallel. It represents the job slot number, starting from 1.

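$(( )) is arithmetic expansion in bash, which evaluates the expression inside. The single quotes stop your own shell from expanding it right away: parallel first substitutes {%}, and the shell it starts for each job then evaluates the arithmetic.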

{%} - 1 subtracts 1 from the job slot number, because CUDA device indices typically start at 0.
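The mapping from slot to device can be sketched in plain bash. The helper name slot_to_device is hypothetical and only illustrates the arithmetic that the one-liner performs inline:

```shell
# Hypothetical helper: map a 1-based job slot to a 0-based CUDA device index.
slot_to_device() {
  local slot=$1
  echo $(( slot - 1 ))
}

# With -j4 the slots are 1..4, so the devices are 0..3.
for slot in 1 2 3 4; do
  echo "slot $slot -> GPU $(slot_to_device "$slot")"
done
```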

So each running job gets a different GPU, determined by its job slot number. A slot is reused only after its job has finished, so no GPU ever runs two jobs at once. For example:

Job 1 uses GPU 0
Job 2 uses GPU 1
Job 3 uses GPU 2
Job 4 uses GPU 3
Job 2 finishes first, freeing GPU 1
Job 5 uses GPU 1
Job 3 finishes, freeing GPU 2
Job 6 uses GPU 2
Job 1 finishes, freeing GPU 0
Job 7 uses GPU 0
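If GNU Parallel is not available, the same scheduling can be approximated in plain bash. The following is only a sketch under some assumptions: the function name run_pool and the token scheme are mine, not something from GNU Parallel, and it relies on opening a FIFO read/write, which works on Linux but is not guaranteed everywhere. The FIFO holds one token per free GPU; a task takes a token before it starts and puts it back when it finishes.

```shell
# Sketch of a GPU worker pool in plain bash (run_pool is a hypothetical name).
# Usage: run_pool <number-of-gpus> <script>...
run_pool() {
  local ngpus=$1; shift
  local dir gpu task

  # The FIFO acts as a queue of free GPU indices.
  dir=$(mktemp -d) || return 1
  mkfifo "$dir/pool" || return 1
  exec 3<>"$dir/pool"   # read/write open so the open itself never blocks (Linux)
  rm -rf "$dir"         # fd 3 keeps the pipe alive after the name is removed

  # Seed the pool with one token per GPU: 0 .. ngpus-1.
  for (( gpu = 0; gpu < ngpus; gpu++ )); do
    echo "$gpu" >&3
  done

  for task in "$@"; do
    read -r gpu <&3     # blocks until some GPU is free
    {
      CUDA_VISIBLE_DEVICES=$gpu bash "$task"
      echo "$gpu" >&3   # hand the GPU back to the pool
    } &
  done

  wait                  # let the last tasks finish
  exec 3>&-             # close the pool
}

# Example (commented out): run_pool 4 run*.sh
```

Because read blocks while all tokens are taken, a GPU never gets two tasks at once, and a new task starts as soon as any token comes back, which matches the two requirements in the question.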

{} is a placeholder for the argument(s) that parallel will substitute with actual filenames or data.

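::: run*.sh specifies that parallel should run the command once for each file matching run*.sh in the current directory. This means if you have files like run01.sh, run02.sh, etc., each of these scripts will be executed. Note that the scripts are invoked by name, so they must be executable, and if the current directory is not on your PATH you should write ::: ./run*.sh instead.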
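To check what the glob hands to parallel before launching anything, a small helper like the one below can list the matches. The name list_tasks is hypothetical; it only mirrors what the shell does when it expands run*.sh, and it skips the case where nothing matches (bash then leaves the pattern unexpanded by default):

```shell
# Hypothetical helper: print the scripts that run*.sh matches in a directory,
# in the order the shell would pass them after ":::".
list_tasks() {
  local dir=${1:-.} f
  for f in "$dir"/run*.sh; do
    [ -e "$f" ] || continue   # no match: the pattern is left as a literal string
    printf '%s\n' "${f##*/}"
  done
}

list_tasks .
```

The zero-padded names from the question (run01.sh ... run10.sh) expand in numeric order; with unpadded names, run10.sh would sort before run2.sh.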