Tags: bash, shell, multiprocessing, sh, multiprocess

How to write a process-pool bash shell script


I have more than 10 tasks to execute, and the system restricts running more than 4 tasks at the same time.

My task can be started like: myprog taskname

How can I write a bash shell script to run these tasks? The most important thing is that when one task finishes, the script starts another immediately, keeping the running task count at 4 at all times.
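
As a baseline, assuming your xargs supports -P (both GNU and BSD xargs do) and the task names can be listed one per line in a file (tasks.txt is just a placeholder name here), the bare requirement can be met in one line; xargs starts a new task the moment one of the 4 running ones exits:

    # assumes xargs with -P support; tasks.txt is a hypothetical file of task names, one per line
    xargs -P 4 -n 1 myprog < tasks.txt

The solution below builds a reusable job pool instead.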


Solution

  • I chanced upon this thread while looking into writing my own process pool, and I particularly liked Brandon Horsley's solution. I couldn't get the signals working right, though, so I took inspiration from Apache and decided to try a pre-fork model with a FIFO as my job queue.

    The following is the function that each worker process runs once it is forked.

    # \brief the worker function that is called when we fork off worker processes
    # \param[in] id  the worker ID
    # \param[in] job_queue  the fifo to read jobs from
    # \param[in] result_log  the temporary log file to write exit codes to
    function _job_pool_worker()
    {
        local id=$1
        local job_queue=$2
        local result_log=$3
        local line=

        exec 7<>"${job_queue}"
        while [[ "${line}" != "${job_pool_end_of_jobs}" && -e "${job_queue}" ]]; do
            # workers block on the exclusive lock to read the job queue
            flock --exclusive 7
            read -r line <"${job_queue}"
            flock --unlock 7
            # the worker should exit if it sees the end-of-jobs marker, or
            # otherwise run the job and save its exit code to the result log.
            if [[ "${line}" == "${job_pool_end_of_jobs}" ]]; then
                # write it one more time for the next sibling so that everyone
                # will know we are exiting.
                echo "${line}" >&7
            else
                _job_pool_echo "### _job_pool_worker-${id}: ${line}"
                # run the job
                { ${line} ; }
                # now check the exit code and prepend "ERROR" to the result log
                # entry, which we will use to count errors and strip out later.
                local result=$?
                local status=
                if [[ "${result}" != "0" ]]; then
                    status=ERROR
                fi
                # now write the result to the log, making sure multiple processes
                # don't trample over each other; fd 8 serves only as the lock,
                # while the entry itself is appended to the log file.
                exec 8<>"${result_log}"
                flock --exclusive 8
                echo "${status}job_pool: exited ${result}: ${line}" >> "${result_log}"
                flock --unlock 8
                exec 8>&-
                _job_pool_echo "### _job_pool_worker-${id}: exited ${result}: ${line}"
            fi
        done
        exec 7>&-
    }
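
    The worker above only consumes the queue; it relies on a handful of pool-management functions to create the FIFO, pre-fork the workers, and tally results. The real versions live in job_pool.sh in the repository linked below, so treat the following as a minimal sketch of how they could be wired up: the public names (job_pool_init, job_pool_run, job_pool_wait, job_pool_shutdown, job_pool_nerrors) match the sample program, but the bodies and the internal helpers (_job_pool_start_workers, _job_pool_stop_workers, job_pool_pool_size) are illustrative assumptions, not the actual implementation.

    # --- illustrative sketch only; see job_pool.sh for the real implementation ---

    job_pool_end_of_jobs="JOB_POOL_END_OF_JOBS"   # sentinel line written into the fifo
    job_pool_nerrors=0

    # print a message only if command echoing was requested at init time
    function _job_pool_echo()
    {
        if [[ "${job_pool_echo_command}" == "1" ]]; then
            echo "$@"
        fi
    }

    # fork off one worker subshell per pool slot
    function _job_pool_start_workers()
    {
        local i
        for ((i = 0; i < job_pool_pool_size; i++)); do
            _job_pool_worker ${i} "${job_pool_job_queue}" "${job_pool_result_log}" &
        done
    }

    # send the end-of-jobs marker and wait for every worker to exit
    # (note: wait collects all background children of the script)
    function _job_pool_stop_workers()
    {
        echo "${job_pool_end_of_jobs}" >> "${job_pool_job_queue}"
        wait
        # the last worker re-echoes the marker before exiting, so one copy is
        # still sitting in the fifo; consume it so new workers don't see it
        exec 9<>"${job_pool_job_queue}"
        read -r leftover <&9
        exec 9>&-
    }

    # \brief set up the fifo and result log, then pre-fork the workers
    # \param[in] pool_size  number of parallel workers (default 4)
    # \param[in] echo_command  1 to echo each job as it is picked up
    function job_pool_init()
    {
        job_pool_pool_size=${1:-4}
        job_pool_echo_command=${2:-0}
        job_pool_job_queue=$(mktemp -u)    # fresh pathname for the fifo
        job_pool_result_log=$(mktemp)
        mkfifo "${job_pool_job_queue}"
        _job_pool_start_workers
    }

    # \brief queue a job: the whole command line becomes one line in the fifo
    function job_pool_run()
    {
        echo "$@" >> "${job_pool_job_queue}"
    }

    # \brief block until all queued jobs finish, then restart the workers
    function job_pool_wait()
    {
        _job_pool_stop_workers
        _job_pool_start_workers
    }

    # \brief tear the pool down and count the ERROR-prefixed log entries
    function job_pool_shutdown()
    {
        _job_pool_stop_workers
        job_pool_nerrors=$(grep -c '^ERROR' "${job_pool_result_log}")
        rm -f "${job_pool_job_queue}" "${job_pool_result_log}"
    }

    One subtlety the sketch has to handle: because each worker re-echoes the end-of-jobs marker for its siblings, one copy is always left in the FIFO after the last worker exits, so it must be consumed before new workers are forked. Also, since the workers are forked with &, they inherit any shell functions (like foobar in the sample below) that were defined before job_pool_init was called.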
    

    You can get a copy of my solution on GitHub. Here's a sample program using my implementation.

    #!/bin/bash
    
    . job_pool.sh
    
    function foobar()
    {
        # do something
        true
    }   
    
    # initialize the job pool to allow 3 parallel jobs; the second argument
    # (0) turns command echoing off
    job_pool_init 3 0
    
    # run jobs
    job_pool_run sleep 1
    job_pool_run sleep 2
    job_pool_run sleep 3
    job_pool_run foobar
    job_pool_run foobar
    job_pool_run /bin/false
    
    # wait until all jobs complete before continuing
    job_pool_wait
    
    # more jobs
    job_pool_run /bin/false
    job_pool_run sleep 1
    job_pool_run sleep 2
    job_pool_run foobar
    
    # don't forget to shut down the job pool
    job_pool_shutdown
    
    # check ${job_pool_nerrors} for the number of jobs that exited non-zero
    echo "job_pool_nerrors: ${job_pool_nerrors}"
    

    Hope this helps!