cluster-computing, slurm, sungridengine, lsf

Why are repetitive calls to squeue in Slurm frowned upon?


Why is it not recommended to run squeue in a loop (to avoid overloading Slurm), while no such limitation is mentioned for the bjobs tool from LSF or qstat from SGE?

The man page for squeue states:

PERFORMANCE

Executing squeue sends a remote procedure call to slurmctld. If enough calls from squeue or other Slurm client commands that send remote procedure calls to the slurmctld daemon come in at once, it can result in a degradation of performance of the slurmctld daemon, possibly resulting in a denial of service.

Do not run squeue or other Slurm client commands that send remote procedure calls to slurmctld from loops in shell scripts or other programs. Ensure that programs limit calls to squeue to the minimum necessary for the information you are trying to gather.

which, to my understanding, discourages the use of e.g. watch squeue. Such a warning is commonly found in site-specific documentation, e.g. here:

Although squeue is a convenient command to query the status of jobs and queues, please be careful not to issue the command excessively, for example, invoking the query for the status of a job every five seconds or so using a script after a job is submitted.
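
To make the question concrete, the pattern both warnings seem to target is a tight monitoring loop, something like this (purely illustrative):

```bash
# Equivalent in spirit to `watch -n 5 squeue`: every iteration sends
# one remote procedure call to slurmctld.
while true; do
    squeue -u "$USER"
    sleep 5
done
```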

In comparison, I could find no such warning for similar tools on other engines, e.g. qstat or bjobs. I see people using all of these tools in a repetitive fashion without distinction, e.g. here for squeue and here for bjobs.

The quote above from the Slurm documentation mentions an RPC; is this different from how other engines work? Is there an architectural difference between Slurm and other grid engines that makes querying the status of all jobs more costly?


Solution

  • Actually, the concern about running squeue too frequently often originates more from cluster administrators than from developers. In this particular case, looking at the commit message of that specific section of the documentation, we learn that it was actually requested by a customer of SchedMD, so most probably an entity running a production cluster.

    The criticality of that advice increases with the size of the cluster and the job turnover. On a 10-node cluster running on average 5-6 jobs per day from a dozen users, you will be fine hitting the Slurm controller with many squeue requests. But on a 4000-node cluster with 10,000 users and 10k jobs per day, you might interfere in a visible way with Slurm's performance; a sketch of gentler polling patterns is given after this answer.

    I have seen at least one site that overrode the qstat command with a rate-limiting version based on cached information (the caching idea is sketched below).

    From a technical point of view, RPC is what most of the alternative schedulers use as well.
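
To illustrate the point about scale, here is a sketch of how to put less pressure on slurmctld when you do need to track a job: narrow the query and widen the interval, or avoid polling altogether. It assumes a job ID in $jobid; job.sh and postprocess.sh are placeholder script names.

```bash
# Poll a single job at a coarse interval: one RPC every 5 minutes
# instead of one for the whole queue every few seconds.
while squeue --jobs="$jobid" --noheader 2>/dev/null | grep -q .; do
    sleep 300
done

# Or avoid the polling loop altogether:
sbatch --wait job.sh                                   # returns only when the job ends
sbatch --dependency=afterok:"$jobid" postprocess.sh    # chain work instead of watching
```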
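The rate-limiting idea mentioned above can be as simple as a wrapper that serves a cached copy of the queue listing and only contacts slurmctld when the cache has expired. A minimal sketch (the cache path, TTL, and use of GNU stat are my assumptions, not what that site actually deployed):

```bash
#!/bin/bash
# Cached squeue wrapper: at most one RPC to slurmctld every CACHE_TTL seconds,
# no matter how often users (or their scripts) call it. Arguments are ignored
# for simplicity; the plain queue listing is always served.
CACHE_FILE="${TMPDIR:-/tmp}/squeue_cache_$USER"
CACHE_TTL=60   # seconds

now=$(date +%s)
if [[ -f "$CACHE_FILE" ]]; then
    age=$(( now - $(stat -c %Y "$CACHE_FILE") ))   # GNU stat: file mtime in epoch seconds
else
    age=$(( CACHE_TTL + 1 ))                       # no cache yet, force a refresh
fi

if (( age > CACHE_TTL )); then
    squeue > "$CACHE_FILE"    # the only place an RPC is actually issued
fi

cat "$CACHE_FILE"
```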
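As for measuring that RPC load: on the Slurm versions I have used, the sdiag command reports slurmctld's scheduling and RPC statistics, which is a convenient way to check whether client commands like squeue dominate the controller's workload.

```bash
# RPC counts and timings as seen by slurmctld, broken down by message
# type and by user; repetitive squeue polling shows up here.
sdiag
```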