Suppose I run a Slurm job with the following configuration:
#!/bin/bash
#SBATCH --nodes=1 # set the number of nodes
#SBATCH --ntasks=1 # Run a single task
#SBATCH --cpus-per-task=4 # Number of CPU cores per task
#SBATCH --time=26:59:00 # set max wallclock time
#SBATCH --mem=16000M # set memory limit per node
#SBATCH --job-name=myjobname # set name of job
#SBATCH --mail-type=ALL # mail alert at start, end and abortion of execution
#SBATCH --mail-user=sb@sw.com # send mail to this address
#SBATCH --output=/path/to/output/%x-%j.out # set output path
echo ' mem: ' $SLURM_MEM
echo '\n nodes: ' $SLURM_NODES
echo '\n ntasks: ' $SLURM_NTASKS
echo '\n cpus: ' $SLURM_CPUS_PER_TASK
echo '\n time: ' $SLURM_TIME
I want to save the configuration of this job (time, memory, number of tasks) so that after the job has finished I know what configuration it was executed with.
So I decided to print these variables to the output file. However, nothing is printed for time and memory:
\n nodes:
\n ntasks: 1
\n cpus: 1
\n time:
Does anyone know a better way, or how to refer to time and memory?
You can dump a lot of information about your job with scontrol show job <job_id>. Among other things, this will give you the memory requested. It will not, however, give you the actual memory usage. For that you will need to use sacct -l -j <job_id>.
So, at the end of your submission script, you can add:
scontrol show job $SLURM_JOB_ID
sacct -l -j $SLURM_JOB_ID
There are many options for selecting the output of the sacct command; refer to the man page for the complete list.
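If you would rather echo the values yourself, as in the question, here is a minimal sketch of one way to do it. It assumes the variable names that sbatch documents for the job environment (SLURM_JOB_NUM_NODES, SLURM_NTASKS, SLURM_CPUS_PER_TASK, SLURM_MEM_PER_NODE) in place of $SLURM_NODES and $SLURM_MEM, which appear to be the reason for the empty lines. The function name print_job_config is made up for illustration, and reading the time limit from squeue is my assumption about a reasonable approach, not the only one.

```shell
#!/bin/bash
# Sketch: record the requested resources from inside the job.
# print_job_config is a hypothetical helper name, not part of Slurm.
print_job_config() {
    # Variables exported by sbatch; "unset" is printed when one is absent
    # (e.g. SLURM_CPUS_PER_TASK is only set if --cpus-per-task was given).
    echo "job id: ${SLURM_JOB_ID:-unset}"
    echo "nodes: ${SLURM_JOB_NUM_NODES:-unset}"
    echo "ntasks: ${SLURM_NTASKS:-unset}"
    echo "cpus per task: ${SLURM_CPUS_PER_TASK:-unset}"
    echo "mem per node (MB): ${SLURM_MEM_PER_NODE:-unset}"
    # Assumption: the time limit is queried from squeue (%l = time limit)
    # rather than read from an environment variable.
    # Skipped when not running inside a job or when squeue is unavailable.
    if [ -n "${SLURM_JOB_ID:-}" ] && command -v squeue >/dev/null 2>&1; then
        echo "time limit: $(squeue -h -j "$SLURM_JOB_ID" -o %l)"
    fi
}

print_job_config
```

Outside of a job this just prints "unset" for every field, so it is safe to try on a login node first. Using double quotes also avoids the literal \n that single-quoted echo prints in bash.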