linux bash parallel-processing environment-variables gnu-parallel

Using GNU Parallel with a variable defined within the terminal session


I am trying to launch an Octave script in parallel using GNU Parallel. Everything works fine, but I have a question regarding exported variables. My workflow before using GNU Parallel was to open a terminal, do export OMP_NUM_THREADS=1, and then execute my Octave script. This way I allocate 1 thread to BLAS, which is used by Octave. When using GNU Parallel, is doing export OMP_NUM_THREADS=1 before using GNU Parallel enough or should I do anything differently? I read about env_parallel but I am not sure whether I need it or not for my use case, and how to use it in case I do.

This is what I do without GNU Parallel (open a terminal and):

export OMP_NUM_THREADS=1
octave --gui

This is what I am doing now with GNU Parallel (open a terminal then):

export OMP_NUM_THREADS=1
readlink -f ./data/*.csv | parallel "octave validation.m {}"

Basically I am trying to process the CSV files within a directory in parallel using validation.m and I would like to make sure BLAS is only using 1 thread.


Solution

  • export variable=value sets variable to value and marks it for export to child processes. That includes parallel, octave, and anything else you run from that shell (barring corner cases such as using env to override what is otherwise in the environment).

    In other words, an exported variable is visible to all descendants (child processes, their children, and so on) of the shell where it was set. So running export OMP_NUM_THREADS=1 before parallel is enough for jobs that run on the local machine. env_parallel is only needed to pass things that are not exported, such as shell functions, aliases, or unexported variables.

    Read up on the Unix process model if you want more detail, but the mechanism is not complicated.
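The inheritance described above can be checked without Octave or GNU Parallel. The following is a minimal sketch, assuming a POSIX sh is available. The nested sh -c calls stand in for the parallel -> octave chain, and LOCAL_ONLY is a hypothetical variable name used only for contrast.

```shell
#!/bin/sh
# Exported: copied into the environment of every process started from this shell.
export OMP_NUM_THREADS=1

# Not exported (hypothetical name, for contrast): stays in this shell only.
unset LOCAL_ONLY
LOCAL_ONLY=1

# A child shell starts a grandchild shell, standing in for parallel -> octave.
# The grandchild prints the value it sees, or "unset" if the variable is missing.
grandchild_sees=$(sh -c 'sh -c "echo \${OMP_NUM_THREADS:-unset}"')
local_seen=$(sh -c 'sh -c "echo \${LOCAL_ONLY:-unset}"')

echo "grandchild sees OMP_NUM_THREADS=$grandchild_sees"
echo "grandchild sees LOCAL_ONLY=$local_seen"
```

With GNU Parallel installed, the same check can be run through it directly, for example parallel 'echo $OMP_NUM_THREADS' ::: 1, which should print the exported value for a locally run job.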