What's the best way to get the valid cpu ids from within a running job?
My idea is to do an allocation --> wrap a docker command with the limits of the allocation --> run nvidia-docker on a remote GPU server.
To limit the Docker container to the allocation, I need to pass it the cpu_ids.
My job submission will look like:
sbatch -o test.txt -c2 -n 10 --mem=10GB --wrap="job that needs the cpu_ids"
Another way (from inside a node) is to parse the SLURM_CPU_BIND_LIST bitmask:
python -c '
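import os
# Hex CPU mask that Slurm exports for this task, e.g. "0x3"
s = os.environ["SLURM_CPU_BIND_LIST"]
v = int(s.strip(), base=16)
# Walk the binary digits from the least significant bit; bit i set means CPU i is allowed
idxs = [i for i, b in enumerate(reversed(f"{v:0b}")) if int(b)]
print(",".join(str(x) for x in idxs))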
'
Output (zero-indexed CPU_IDs):
76,77,80,81
Note that these are Slurm's CPU_IDs, which may differ from the operating system's CPU numbering. (The naive mapping system_cpu_id = slurm_cpu_id might hold, though.)
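If you want something reusable for the docker wrapper, here is a rough sketch of the same bitmask parsing as a small script. The helper names mask_to_cpu_ids and to_cpuset are made up for illustration. It assumes SLURM_CPU_BIND_LIST holds one or more comma-separated hex masks (I have only checked the single-mask case shown above), and it carries the same caveat about Slurm ids versus system ids. The range string it prints is in the form that docker run --cpuset-cpus accepts.

```python
import os


def mask_to_cpu_ids(mask_list):
    """Return the sorted CPU ids whose bits are set in the given hex mask(s).

    Assumption: several masks may be given, separated by commas; their
    set bits are merged into one list.
    """
    ids = set()
    for mask in mask_list.split(","):
        mask = mask.strip()
        if not mask:
            continue
        value = int(mask, 16)  # base 16 accepts an optional "0x" prefix
        bit = 0
        while value:
            if value & 1:
                ids.add(bit)
            value >>= 1
            bit += 1
    return sorted(ids)


def to_cpuset(ids):
    """Collapse sorted ids into range notation, e.g. 76,77,80,81 -> 76-77,80-81."""
    parts = []
    start = prev = None
    for i in ids:
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = i
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


if __name__ == "__main__":
    # Prints an empty line when the variable is not set (e.g. outside a job step).
    print(to_cpuset(mask_to_cpu_ids(os.environ.get("SLURM_CPU_BIND_LIST", ""))))
```

Inside the job you could then pass the printed string to docker, e.g. docker run --cpuset-cpus "$(python cpuset.py)" ..., where cpuset.py is just a placeholder name for wherever you save the script.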